End The Data Engineering Nightmare with Metrics.
End The Data Engineering Nightmare with Metrics.It's a nightmare when everyone—from analysts to data engineers—has to remember that "sales" means transaction types 1, 4, and 5, minus refunds (types 3, 6, and 7). Life would be so much easier if we could define that rule once and write something clean like:
SQL Query Simplification - Traditional vs Databricks Metrics Views Approach.
SELECT MEASURE(sales) FROM transactions # instead of SELECT SUM(amount) FROM transactions WHERE transaction_type IN (1,4,5) SELECT MEASURE(refunds) FROM transactions # instead of SELECT SUM(amount) FROM transactions WHERE transaction_type IN (3,6,7) SELECT MEASURE(netsale) FROM transactions # instead of # SELECT # SUM( # CASE # WHEN transaction_type IN (1,4,5) THEN amount -- sales # WHEN transaction_type IN (3,6,7) THEN -amount -- refunds # END # ) AS netsale # FROM transactions;
Imagine if these measures could be connected directly to our transaction table, so alongside the table, we could have predefined dimensions and measures. We could keep our code dry by eliminating repeated transaction_type IDs and having a centralized place to easily update these ID lists.
This is now possible (since SQL Warehouse 2025.16 or DBR 16.4) thanks to the metrics views syntax:
CREATE OR REPLACE VIEW .. WITH METRICS LANGUAGE YAML AS $$ version: 0.1 source: ... filter: ... dimensions: ... measures: ... $$;
Setting Up Metrics Views
Our transactions table, for which we'll define metrics, looks like this:
Let's define our metrics:
dimensions: - name: tx_id expr: tx_id - name: tx_date expr: date(tx_ts) - name: tx_type expr: CASE WHEN transaction_type IN (1,4,5) THEN 'Sale' WHEN transaction_type IN (3,6,7) THEN 'Refund' END measures: - name: sales_amount expr: SUM(amount) FILTER (WHERE transaction_type IN (1,4,5)) - name: refund_amount expr: SUM(amount) FILTER (WHERE transaction_type IN (3,6,7)) - name: net_sales expr: sales_amount - refund_amount
After creating the METRICS VIEW, these metrics become visible to everyone in Unity Catalog:
Databricks Unity Catalog Metrics View - Transaction Analytics Dashboard
Dimensions can be used as normal fields, but measures need to be wrapped in the MEASURE() function.
Using Metrics Views in Practice
Get total sales (total for transaction_type IN (1,4,5)):
SELECT MEASURE(sales_amount) FROM transaction_metrics # MEASURE sales_amount = SUM(amount) FILTER (WHERE transaction_type IN (1,4,5)) # output: 410.0
Get total refunds (total for transaction_type IN (3, 6, 7)):
SELECT MEASURE(refund_amount) FROM transaction_metrics # MEASURE refund_amount = SUM(amount) FILTER (WHERE transaction_type IN (3,6,7) # output: 80.0
Calculate net sales (sales amount minus refund amount):
SELECT MEASURE(net_sales) FROM transaction_metrics # MEASURE net_sales = sales_amount - refund_amount # output: 333.0
Net sales by day with dimension grouping:
SELECT tx_date, MEASURE(net_sales) FROM transaction_metrics GROUP BY tx_date # MEASURE net_sales = sales_amount - refund_amount # DIMENSION tx_date = date(tx_ts) # # output: # tx_date measure(net_sales) # 2025-05-01 70.00 # 2025-05-02 50.00 # 2025-05-03 70.00 # 2025-05-04 80.00 # 2025-05-05 60.00
All refund IDs with filter (without WHERE, we would get the same numbers as measure include FILTER - using WHERE prevents us from seeing other IDs with zero or NULL values) :
SELECT tx_id, MEASURE(refund_amount) FROM transaction_metrics WHERE tx_type = 'Refund' GROUP BY tx_id # MEASURE refund_amount = SUM(amount) FILTER (WHERE transaction_type IN (3,6,7) # DIMENSION tx_id = tx_id # DIMENSION tx_type = CASE # WHEN transaction_type IN (1,4,5) THEN 'Sale' # WHEN transaction_type IN (3,6,7) THEN 'Refund' # END # # output: 410.0 # tx_id measure(refund_amount) # 2 30.00 # 4 10.00 # 6 40.00
The Cultural Shift
Making metrics visible to the business through Unity Catalog is a huge win, but the metrics view syntax may feel unfamiliar to many analysts—saved query snippets will likely feel more natural at first. For data engineers, however, metrics views are transformative: they centralize business rules, keep code DRY, and cleanly separate transformation logic from business definitions (like those transaction-type rules).
With metrics views, you can build robust, SQL-centric data engineering frameworks. They also lend themselves well to testing—it's easy to write validation checks that ensure SQL queries built on metrics views return expected results.
Complete Feature Set
According to the documentation, here are all available metrics options:
Version: Defaults to 0.1. This is the version of the metric view specification.
Source: The source data for the metric view. This can be a table-like asset or a SQL query.
Joins: Optional. Used to LEFT JOIN the fact table defined under the source with dimension tables as a star schema model.
Filter: Optional. A SQL boolean expression that applies to all queries; equivalent to the WHERE clause.
Dimensions: An array of dimension definitions, including the dimension name and expression.
Measures: An array of aggregate expression columns.
Future Possibilities
The ability to edit metrics directly in Unity Catalog after creation is a nice touch. Looking ahead, it would be helpful to add UI components (like dropdowns) for editing metrics and establish streamlined CI/CD processes to version and certify metrics—though this can certainly be accomplished today with proper design.