Tutorial: Build a complete metric view with joins
This tutorial walks you through building a comprehensive sales analytics metric view using the TPC-H dataset. By the end, you'll have a metric view that:
- Joins orders and customers across multiple tables using a snowflake schema
- Defines dimensions for time, geography, and order attributes
- Calculates simple and complex measures including ratios, filtered aggregations, and window measures
- Uses composability to build complex metrics from simpler measures
- Includes agent metadata for dashboards and AI tools
If you're new to metric views, start with Create and edit metric views to learn the basics. This tutorial extends that foundation with real-world complexity.
Requirements
To complete this tutorial, you need:
- Your workspace must be enabled for Unity Catalog.
- A SQL warehouse or compute resource running Databricks Runtime 17.3 or above.
The data model
The TPC-H dataset models a wholesale supply chain. This tutorial uses three tables joined in a snowflake schema:
ordersjoins tocustomerono_custkey = c_custkeycustomerjoins tonationonc_nationkey = n_nationkey
Table | Role | Key columns |
|---|---|---|
| Fact table (order transactions) |
|
| Dimension table (customer details) |
|
| Dimension table (country or region reference) |
|
Step 1: Open the YAML editor
This tutorial uses the YAML editor to create your metric view definition. For other methods, see Create and edit metric views.
To open the YAML editor:
- Click
Catalog in the workspace sidebar.
- Use the search bar to find
samples.tpch.orders. - Click the table name to show table details.
- Click Create > Metric view. In the Create metric view dialog, enter a name and select a catalog and schema destination. Then click Create.
- If necessary, click YAML to open the YAML editor.
Step 2: Set up the metric view
Start with the version and a descriptive comment.
version: 1.1
comment: |-
Sales analytics metric view for order performance analysis.
Joins orders with customers and geography.
Owner: Analytics Team
Last updated: 2025-01-15
This example sets the version and adds a description:
version: 1.1determines the version of YAML specification.commentdocuments the metric view's purpose, which appears in Catalog Explorer.
Step 3: Define the source and joins
Define the primary source table and join related tables. The nation join is nested under the customer join to reflect the snowflake schema, where nation is a subdimension of customer.
source: SELECT * FROM samples.tpch.orders
joins:
- name: customer
source: samples.tpch.customer
'on': o_custkey = c_custkey
joins:
- name: nation
source: samples.tpch.nation
'on': c_nationkey = n_nationkey
filter: o_orderdate >= '1995-01-01'
This example defines the source table, joins, and filter:
sourcesets the fact table (orders) as the grain.joinsbrings in customer data using a many-to-one relationship.- The nested
nationjoin demonstrates a snowflake schema pattern, joining throughcustomerto reach geographic data. filterlimits to recent data, which applies to all queries on this metric view.
Step 4: Define dimensions
Dimensions are the attributes users group and filter by.
dimensions:
- name: order_date
expr: o_orderdate
- name: order_month
expr: "DATE_TRUNC('MONTH', o_orderdate)"
- name: order_year
expr: YEAR(o_orderdate)
- name: order_status
expr: |-
CASE o_orderstatus
WHEN 'O' THEN 'Open'
WHEN 'P' THEN 'Processing'
WHEN 'F' THEN 'Fulfilled'
END
- name: order_priority
expr: "SPLIT(o_orderpriority, '-')[0]"
- name: customer_name
expr: customer.c_name
- name: market_segment
expr: customer.c_mktsegment
- name: customer_nation
expr: customer.nation.n_name
This example demonstrates dimension patterns:
- Time dimensions at multiple granularities (date, month, year) support different analysis needs.
CASEexpressions transform cryptic codes into business-friendly labels.- Joined columns reference tables using the alias defined in
joins(such ascustomer.c_name). - Nested join columns use chained dot notation (such as
customer.nation.n_name) to traverse the snowflake schema.
Step 5: Define measures
Measures are the calculations users want to analyze. Define atomic measures first, then use composability to build complex metrics that reference earlier-defined measures with the MEASURE() function.
measures:
- name: order_count
expr: COUNT(DISTINCT o_orderkey)
- name: total_revenue
expr: SUM(o_totalprice)
- name: unique_customers
expr: COUNT(DISTINCT o_custkey)
- name: avg_order_value
expr: MEASURE(total_revenue) / MEASURE(order_count)
- name: revenue_per_customer
expr: MEASURE(total_revenue) / MEASURE(unique_customers)
- name: open_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'O')
- name: fulfilled_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'F')
- name: t7d_customers
expr: COUNT(DISTINCT o_custkey)
window:
- order: order_date
semiadditive: last
range: trailing 7 day
This example demonstrates measure patterns:
- Atomic measures (
order_count,total_revenue,unique_customers) are simple aggregations that form the building blocks. - Composed measures (
avg_order_value,revenue_per_customer) reference earlier-defined measures usingMEASURE()instead of duplicating aggregation logic. Iftotal_revenuechanges, these measures automatically use the updated definition. See Composability. FILTERclauses create conditional metrics without separate dimensions.- The
t7d_customerswindow measure calculates a rolling 7-day count of unique customers, useful for tracking engagement trends over time. See Window measures for more window measure patterns.
Step 6: Add agent metadata
Agent metadata enhances data visualization and improves LLM accuracy by providing display names, format specifications, and synonyms. Add these properties to your dimensions and measures to give business context to your metrics.
dimensions:
- name: order_date
expr: o_orderdate
display_name: Order Date
- name: order_month
expr: "DATE_TRUNC('MONTH', o_orderdate)"
display_name: Order Month
- name: order_year
expr: YEAR(o_orderdate)
display_name: Order Year
- name: order_status
expr: |-
CASE o_orderstatus
WHEN 'O' THEN 'Open'
WHEN 'P' THEN 'Processing'
WHEN 'F' THEN 'Fulfilled'
END
display_name: Order Status
synonyms:
- status
- fulfillment status
- name: order_priority
expr: "SPLIT(o_orderpriority, '-')[0]"
display_name: Priority
- name: customer_name
expr: customer.c_name
display_name: Customer Name
- name: market_segment
expr: customer.c_mktsegment
display_name: Market Segment
synonyms:
- segment
- industry
- name: customer_nation
expr: customer.nation.n_name
display_name: Country
synonyms:
- nation
- country
measures:
- name: order_count
expr: COUNT(DISTINCT o_orderkey)
display_name: Order Count
format:
type: number
decimal_places:
type: exact
places: 0
abbreviation: compact
- name: total_revenue
expr: SUM(o_totalprice)
display_name: Total Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- revenue
- sales
- name: unique_customers
expr: COUNT(DISTINCT o_custkey)
display_name: Unique Customers
format:
type: number
decimal_places:
type: exact
places: 0
abbreviation: compact
- name: avg_order_value
expr: MEASURE(total_revenue) / MEASURE(order_count)
display_name: Avg Order Value
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- AOV
- name: revenue_per_customer
expr: MEASURE(total_revenue) / MEASURE(unique_customers)
display_name: Revenue per Customer
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
- name: open_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'O')
display_name: Open Order Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- backlog
- name: fulfilled_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'F')
display_name: Fulfilled Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
- name: t7d_customers
expr: COUNT(DISTINCT o_custkey)
window:
- order: order_date
semiadditive: last
range: trailing 7 day
display_name: 7-Day Rolling Customers
format:
type: number
decimal_places:
type: exact
places: 0
This example adds the following agent metadata:
display_nameprovides human-readable labels that appear in visualization tools instead of technical column names.formatdefines how values display (currency, number, percentage) in dashboards.synonymshelp AI tools like Genie discover dimensions and measures through natural language queries.
For complete details on agent metadata options, see Agent metadata in metric views.
Step 7: Complete YAML definition
Here's the full metric view definition:
version: 1.1
source: SELECT * FROM samples.tpch.orders
joins:
- name: customer
source: samples.tpch.customer
'on': o_custkey = c_custkey
joins:
- name: nation
source: samples.tpch.nation
'on': c_nationkey = n_nationkey
filter: o_orderdate >= '1995-01-01'
comment: |-
Sales analytics metric view for order performance analysis.
Joins orders with customers and geography.
Owner: Analytics Team
Last updated: 2025-01-15
dimensions:
- name: order_date
expr: o_orderdate
display_name: Order Date
- name: order_month
expr: "DATE_TRUNC('MONTH', o_orderdate)"
display_name: Order Month
- name: order_year
expr: YEAR(o_orderdate)
display_name: Order Year
- name: order_status
expr: |-
CASE o_orderstatus
WHEN 'O' THEN 'Open'
WHEN 'P' THEN 'Processing'
WHEN 'F' THEN 'Fulfilled'
END
display_name: Order Status
synonyms:
- status
- fulfillment status
- name: order_priority
expr: "SPLIT(o_orderpriority, '-')[0]"
display_name: Priority
- name: customer_name
expr: customer.c_name
display_name: Customer Name
- name: market_segment
expr: customer.c_mktsegment
display_name: Market Segment
synonyms:
- segment
- industry
- name: customer_nation
expr: customer.nation.n_name
display_name: Country
synonyms:
- nation
- country
measures:
- name: order_count
expr: COUNT(DISTINCT o_orderkey)
display_name: Order Count
format:
type: number
decimal_places:
type: exact
places: 0
abbreviation: compact
- name: total_revenue
expr: SUM(o_totalprice)
display_name: Total Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- revenue
- sales
- name: unique_customers
expr: COUNT(DISTINCT o_custkey)
display_name: Unique Customers
format:
type: number
decimal_places:
type: exact
places: 0
abbreviation: compact
- name: avg_order_value
expr: MEASURE(total_revenue) / MEASURE(order_count)
display_name: Avg Order Value
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- AOV
- name: revenue_per_customer
expr: MEASURE(total_revenue) / MEASURE(unique_customers)
display_name: Revenue per Customer
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
- name: open_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'O')
display_name: Open Order Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- backlog
- name: fulfilled_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'F')
display_name: Fulfilled Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
- name: t7d_customers
expr: COUNT(DISTINCT o_custkey)
window:
- order: order_date
semiadditive: last
range: trailing 7 day
display_name: 7-Day Rolling Customers
format:
type: number
decimal_places:
type: exact
places: 0
Create the metric view using SQL
If you're building this definition outside Catalog Explorer, run the following SQL to create the metric view. Paste the complete YAML from above between the $$ delimiters.
CREATE OR REPLACE VIEW catalog.schema.tpch_sales_analytics
WITH METRICS
LANGUAGE YAML
AS $$
version: 1.1
source: SELECT * FROM samples.tpch.orders
joins:
- name: customer
source: samples.tpch.customer
'on': o_custkey = c_custkey
joins:
- name: nation
source: samples.tpch.nation
'on': c_nationkey = n_nationkey
filter: o_orderdate >= '1995-01-01'
comment: |-
Sales analytics metric view for order performance analysis.
Joins orders with customers and geography.
Owner: Analytics Team
Last updated: 2025-01-15
dimensions:
- name: order_date
expr: o_orderdate
display_name: Order Date
- name: order_month
expr: "DATE_TRUNC('MONTH', o_orderdate)"
display_name: Order Month
- name: order_year
expr: YEAR(o_orderdate)
display_name: Order Year
- name: order_status
expr: |-
CASE o_orderstatus
WHEN 'O' THEN 'Open'
WHEN 'P' THEN 'Processing'
WHEN 'F' THEN 'Fulfilled'
END
display_name: Order Status
synonyms:
- status
- fulfillment status
- name: order_priority
expr: "SPLIT(o_orderpriority, '-')[0]"
display_name: Priority
- name: customer_name
expr: customer.c_name
display_name: Customer Name
- name: market_segment
expr: customer.c_mktsegment
display_name: Market Segment
synonyms:
- segment
- industry
- name: customer_nation
expr: customer.nation.n_name
display_name: Country
synonyms:
- nation
- country
measures:
- name: order_count
expr: COUNT(DISTINCT o_orderkey)
display_name: Order Count
format:
type: number
decimal_places:
type: exact
places: 0
abbreviation: compact
- name: total_revenue
expr: SUM(o_totalprice)
display_name: Total Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- revenue
- sales
- name: unique_customers
expr: COUNT(DISTINCT o_custkey)
display_name: Unique Customers
format:
type: number
decimal_places:
type: exact
places: 0
abbreviation: compact
- name: avg_order_value
expr: MEASURE(total_revenue) / MEASURE(order_count)
display_name: Avg Order Value
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- AOV
- name: revenue_per_customer
expr: MEASURE(total_revenue) / MEASURE(unique_customers)
display_name: Revenue per Customer
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
- name: open_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'O')
display_name: Open Order Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- backlog
- name: fulfilled_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'F')
display_name: Fulfilled Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
- name: t7d_customers
expr: COUNT(DISTINCT o_custkey)
window:
- order: order_date
semiadditive: last
range: trailing 7 day
display_name: 7-Day Rolling Customers
format:
type: number
decimal_places:
type: exact
places: 0
$$;
For other ways to create a metric view, see Create and edit metric views.
Step 8: Query your metric view
Now you can query using business-friendly syntax:
-- Aggregates total revenue, order count, and average order value
-- by customer nation and market segment, ranked by highest revenue first.
SELECT
customer_nation,
market_segment,
MEASURE(total_revenue) AS total_revenue,
MEASURE(order_count) AS order_count,
MEASURE(avg_order_value) AS avg_order_value
FROM catalog.schema.tpch_sales_analytics
GROUP BY customer_nation, market_segment
ORDER BY total_revenue DESC;
-- Monthly trend with backlog analysis
SELECT
order_month,
order_status,
MEASURE(total_revenue) AS total_revenue,
MEASURE(open_order_revenue) AS open_order_revenue
FROM catalog.schema.tpch_sales_analytics
GROUP BY order_month, order_status
ORDER BY order_month;
What you learned
You built a metric view that demonstrates:
Feature | Example |
|---|---|
Orders to customer to nation (nested many-to-one joins) | |
Date, month, year granularity | |
| |
| |
| |
| |
Rolling 7-day customer count using | |
|
Next steps
- Window measures to calculate rolling averages and year-to-date totals.
- Materialization for metric views to improve query performance for large datasets.
- Use metric views to use your metric view in AI/BI dashboards.