Defining Table Relationships in Databricks Genie

Understanding and defining table relationships is a best practice in data modeling, and it's particularly important when working with LLMs like Databricks Genie. While Primary and Foreign keys in Lakehouse are informational only, Genie leverages them to understand your data structure and generate better insights.

But what if you haven't registered foreign keys in your schema? Let's explore how to master relationships in Genie, both with and without formal foreign key constraints.

Using Foreign Keys in Genie

When you create tables with foreign key constraints like the example below, Genie can automatically understand the relationships in your data model:

CREATE TABLE products (
  product_id BIGINT GENERATED ALWAYS AS IDENTITY,
  sku STRING NOT NULL,
  product_name STRING NOT NULL,
  brand_id BIGINT NOT NULL,
  category_id BIGINT NOT NULL,
  supplier_id BIGINT NOT NULL,
  standard_cost DECIMAL(10,2) NOT NULL,
  list_price DECIMAL(10,2) NOT NULL,
  CONSTRAINT pk_products PRIMARY KEY (product_id),
  CONSTRAINT fk_products_brand FOREIGN KEY (brand_id) REFERENCES brands(brand_id),
  CONSTRAINT fk_products_category FOREIGN KEY (category_id) REFERENCES categories(category_id),
  CONSTRAINT fk_products_supplier FOREIGN KEY (supplier_id) REFERENCES suppliers(supplier_id)
)
COMMENT 'Products with brand, category, supplier references';

Once registered, Unity Catalog displays these relationships with a helpful visual diagram, making it easy to understand your data model at a glance.

When you query Genie (through Databricks One or the Genie interface), it recognizes these foreign key relationships and uses them to provide accurate, context-aware responses about your data. It will tell us about the relations:

Defining Relationships Directly in Genie

But what about tables that aren't linked through formal foreign keys? This is common with fact tables or when working with legacy data models.

The good news is that you can define relationships directly within Genie.

To add a relationship:

  1. Navigate to the Joins section in Genie

  2. Click "Add" to create a new relationship

  3. Define the relationship type between your tables

One major advantage of defining relationships in Genie is the flexibility to specify cardinality:

“Many to One”,
“One to Many”,
“One to One”,
“Many to Many”,

This level of detail isn't currently available with traditional FOREIGN KEY constraints, giving you more precise control over how Genie interprets your data model.

The Best of Both Worlds

Once you've defined relationships both through foreign keys and directly in Genie, your AI assistant has a complete picture of your data structure. When you ask questions, Genie seamlessly uses both types of relationships to provide comprehensive answers.

This hybrid approach gives you the flexibility to work with any data model (whether it has formal constraints or not) while still getting the full benefits of Genie's AI-powered insights.

Try It Yourself

Want to experiment with this approach? The complete dataset used in this example is available on GitHub: medium/topics/202509/genie relathionship.ipynb

Hubert Dudek

Databricks MVP | Advisor to Databricks Product Board and Technical advisor to SunnyData

https://www.linkedin.com/in/hubertdudek/
Next
Next

Why VARIANT Outperforms STRING for JSON Data