Lakebase: The best of both worlds

The path to a hybrid database

What if you could have unlimited storage for processing terabytes of data and retrieve a single record in milliseconds?

Too good to be true? Enter Lakebase, Databricks' first step toward a true hybrid database that handles both analytical and transactional workloads.

While Lakebase isn't yet a complete hybrid solution (it uses a separate PostgreSQL engine), its deep integration with the Databricks platform brings transactional capabilities directly into your analytical environment.

A brief history

Historically, we've had two separate worlds: row-based transactional databases with primary keys for fast single-record retrieval, and column-oriented analytical databases for processing large datasets.

Then Databricks revolutionized data architecture by decoupling storage from compute, leveraging cloud storage with open formats. We call it the Lakehouse, and today it has become the industry standard.

This decoupling enabled truly unlimited data processing for analytical workloads. Now, Databricks is applying the same principle to transactional databases, paving the way for genuine hybrid systems.



Sync to Lakebase and get lookups in milliseconds

Sync a table to Lakebase and, because Lakebase is a transactional database with row-oriented storage and primary keys, individual records come back in milliseconds (or even faster).
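To make the primary-key idea concrete, here is a minimal sketch of a point lookup. It is not Lakebase code: SQLite from the Python standard library stands in for the PostgreSQL engine so the example runs anywhere, and the `customers` table, its columns, and the helper names are made up for illustration. The principle is the same, though: the query plan reports a search on the primary key rather than a full scan.

```python
import sqlite3

# Assumption: SQLite is only a stand-in for Lakebase's PostgreSQL engine.
# The table name, columns, and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL, segment TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ada", "enterprise"), (2, "Linus", "startup"), (3, "Grace", "enterprise")],
)

LOOKUP_SQL = "SELECT customer_id, name, segment FROM customers WHERE customer_id = ?"


def lookup_customer(conn, customer_id):
    """Fetch one row by primary key; returns None when the key is absent."""
    return conn.execute(LOOKUP_SQL, (customer_id,)).fetchone()


def lookup_plan(conn, customer_id):
    """Return the engine's query plan text for the same lookup."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + LOOKUP_SQL, (customer_id,)).fetchall()
    return " ".join(row[-1] for row in rows)


print(lookup_customer(conn, 2))
print(lookup_plan(conn, 2))
```

Against a real Lakebase instance you would connect with a PostgreSQL client instead, but the query itself would look much the same.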

The other way around is also possible

The integration works both ways: create a table in Lakebase for transactional inserts, and it's immediately accessible in Databricks through Lakehouse Federation.

Build a transactional web API in minutes

While feature serving can be intimidating for some, Lakebase makes it surprisingly simple. You can create a web API endpoint (yes, with hosting included, on a serving endpoint) almost instantly. Then you can fetch a lookup value in around 1 millisecond from any application, inside or outside Databricks. It’s just a few simple steps:

  1. Create a lookup function using the ready-made lookup spec

  2. Create a serving endpoint that will host code for the lookup function (one serving endpoint can handle multiple functions)

  3. Start sending web requests to the endpoint and get instant responses

That's it. You now have a production-ready API accessible from any application.
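As a rough illustration of step 3, the sketch below builds the web request a client application might send. Treat the details as assumptions: the workspace URL, the endpoint name `customer-lookup`, the `customer_id` key, and the token placeholder are all hypothetical, and the `dataframe_records` payload shape should be checked against the serving endpoint documentation for your workspace. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_lookup_request(workspace_url, endpoint_name, token, keys):
    """Build (but do not send) a POST request for a serving endpoint lookup.

    Assumption: the endpoint accepts a JSON body of the form
    {"dataframe_records": [{...lookup keys...}]}.
    """
    url = f"{workspace_url.rstrip('/')}/serving-endpoints/{endpoint_name}/invocations"
    body = json.dumps({"dataframe_records": [keys]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send_lookup(request, timeout=5.0):
    """Send the request and decode the JSON response (needs a live endpoint)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Hypothetical values; replace them with your own workspace, endpoint, and token.
request = build_lookup_request(
    "https://example.cloud.databricks.com",
    "customer-lookup",
    "<personal-access-token>",
    {"customer_id": 2},
)
print(request.get_method(), request.full_url)
```

Calling `send_lookup(request)` against a real endpoint would return the looked-up values as parsed JSON. Keeping request construction separate from sending makes the payload easy to inspect before anything goes over the network.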

Complete app development on a single platform

Need a user interface? Databricks Apps supports both vertical and horizontal scaling and includes ready-made templates to get you started quickly.

Git-style version control for your database

Lakebase also uses an open format (vanilla files), but what's remarkable (and what I love about it) is that you can use Git-style branching to manage the database!

Why Lakebase matters

This architecture unlocks countless possibilities within the Databricks ecosystem. Combined with straightforward, transparent billing, Lakebase removes many traditional barriers between analytical and transactional workloads.

If you haven't explored Lakebase yet, now is the time.

Hubert Dudek

Databricks MVP | Advisor to Databricks Product Board and Technical advisor to SunnyData

https://www.linkedin.com/in/hubertdudek/