Skip to main content

Tutorial: Build a complete metric view with joins

This tutorial walks you through building a comprehensive sales analytics metric view using the TPC-H dataset. By the end, you'll have a metric view that:

  • Joins orders and customers across multiple tables using a snowflake schema
  • Defines dimensions for time, geography, and order attributes
  • Calculates simple and complex measures including ratios, filtered aggregations, and window measures
  • Uses composability to build complex metrics from simpler measures
  • Includes agent metadata for dashboards and AI tools

If you're new to metric views, start with Create and edit metric views to learn the basics. This tutorial extends that foundation with real-world complexity.

Requirements

To complete this tutorial, you need:

  • Your workspace must be enabled for Unity Catalog.
  • A SQL warehouse or compute resource running Databricks Runtime 17.3 or above.

The data model

The TPC-H dataset models a wholesale supply chain. This tutorial uses three tables joined in a snowflake schema:

  • orders joins to customer on o_custkey = c_custkey
  • customer joins to nation on c_nationkey = n_nationkey

Table

Role

Key columns

orders

Fact table (order transactions)

o_orderkey, o_custkey, o_totalprice, o_orderdate, o_orderstatus

customer

Dimension table (customer details)

c_custkey, c_name, c_mktsegment, c_nationkey

nation

Dimension table (country or region reference)

n_nationkey, n_name, n_regionkey

Step 1: Open the YAML editor

This tutorial uses the YAML editor to create your metric view definition. For other methods, see Create and edit metric views.

To open the YAML editor:

  1. Click Data icon. Catalog in the workspace sidebar.
  2. Use the search bar to find samples.tpch.orders.
  3. Click the table name to show table details.
  4. Click Create > Metric view. In the Create metric view dialog, enter a name and select a catalog and schema destination. Then click Create.
  5. If necessary, click YAML to open the YAML editor.

Step 2: Set up the metric view

Start with the version and a descriptive comment.

YAML
version: 1.1

comment: |-
Sales analytics metric view for order performance analysis.
Joins orders with customers and geography.
Owner: Analytics Team
Last updated: 2025-01-15

This example sets the version and adds a description:

  • version: 1.1 determines the version of YAML specification.
  • comment documents the metric view's purpose, which appears in Catalog Explorer.

Step 3: Define the source and joins

Define the primary source table and join related tables. The nation join is nested under the customer join to reflect the snowflake schema, where nation is a subdimension of customer.

YAML
source: SELECT * FROM samples.tpch.orders

joins:
- name: customer
source: samples.tpch.customer
'on': o_custkey = c_custkey
joins:
- name: nation
source: samples.tpch.nation
'on': c_nationkey = n_nationkey

filter: o_orderdate >= '1995-01-01'

This example defines the source table, joins, and filter:

  • source sets the fact table (orders) as the grain.
  • joins brings in customer data using a many-to-one relationship.
  • The nested nation join demonstrates a snowflake schema pattern, joining through customer to reach geographic data.
  • filter limits to recent data, which applies to all queries on this metric view.

Step 4: Define dimensions

Dimensions are the attributes users group and filter by.

YAML
dimensions:
- name: order_date
expr: o_orderdate

- name: order_month
expr: "DATE_TRUNC('MONTH', o_orderdate)"

- name: order_year
expr: YEAR(o_orderdate)

- name: order_status
expr: |-
CASE o_orderstatus
WHEN 'O' THEN 'Open'
WHEN 'P' THEN 'Processing'
WHEN 'F' THEN 'Fulfilled'
END

- name: order_priority
expr: "SPLIT(o_orderpriority, '-')[0]"

- name: customer_name
expr: customer.c_name

- name: market_segment
expr: customer.c_mktsegment

- name: customer_nation
expr: customer.nation.n_name

This example demonstrates dimension patterns:

  • Time dimensions at multiple granularities (date, month, year) support different analysis needs.
  • CASE expressions transform cryptic codes into business-friendly labels.
  • Joined columns reference tables using the alias defined in joins (such as customer.c_name).
  • Nested join columns use chained dot notation (such as customer.nation.n_name) to traverse the snowflake schema.

Step 5: Define measures

Measures are the calculations users want to analyze. Define atomic measures first, then use composability to build complex metrics that reference earlier-defined measures with the MEASURE() function.

YAML
measures:
- name: order_count
expr: COUNT(DISTINCT o_orderkey)

- name: total_revenue
expr: SUM(o_totalprice)

- name: unique_customers
expr: COUNT(DISTINCT o_custkey)

- name: avg_order_value
expr: MEASURE(total_revenue) / MEASURE(order_count)

- name: revenue_per_customer
expr: MEASURE(total_revenue) / MEASURE(unique_customers)

- name: open_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'O')

- name: fulfilled_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'F')

- name: t7d_customers
expr: COUNT(DISTINCT o_custkey)
window:
- order: order_date
semiadditive: last
range: trailing 7 day

This example demonstrates measure patterns:

  • Atomic measures (order_count, total_revenue, unique_customers) are simple aggregations that form the building blocks.
  • Composed measures (avg_order_value, revenue_per_customer) reference earlier-defined measures using MEASURE() instead of duplicating aggregation logic. If total_revenue changes, these measures automatically use the updated definition. See Composability.
  • FILTER clauses create conditional metrics without separate dimensions.
  • The t7d_customers window measure calculates a rolling 7-day count of unique customers, useful for tracking engagement trends over time. See Window measures for more window measure patterns.

Step 6: Add agent metadata

Agent metadata enhances data visualization and improves LLM accuracy by providing display names, format specifications, and synonyms. Add these properties to your dimensions and measures to give business context to your metrics.

YAML
dimensions:
- name: order_date
expr: o_orderdate
display_name: Order Date
- name: order_month
expr: "DATE_TRUNC('MONTH', o_orderdate)"
display_name: Order Month
- name: order_year
expr: YEAR(o_orderdate)
display_name: Order Year
- name: order_status
expr: |-
CASE o_orderstatus
WHEN 'O' THEN 'Open'
WHEN 'P' THEN 'Processing'
WHEN 'F' THEN 'Fulfilled'
END
display_name: Order Status
synonyms:
- status
- fulfillment status
- name: order_priority
expr: "SPLIT(o_orderpriority, '-')[0]"
display_name: Priority
- name: customer_name
expr: customer.c_name
display_name: Customer Name
- name: market_segment
expr: customer.c_mktsegment
display_name: Market Segment
synonyms:
- segment
- industry
- name: customer_nation
expr: customer.nation.n_name
display_name: Country
synonyms:
- nation
- country

measures:
- name: order_count
expr: COUNT(DISTINCT o_orderkey)
display_name: Order Count
format:
type: number
decimal_places:
type: exact
places: 0
abbreviation: compact
- name: total_revenue
expr: SUM(o_totalprice)
display_name: Total Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- revenue
- sales
- name: unique_customers
expr: COUNT(DISTINCT o_custkey)
display_name: Unique Customers
format:
type: number
decimal_places:
type: exact
places: 0
abbreviation: compact
- name: avg_order_value
expr: MEASURE(total_revenue) / MEASURE(order_count)
display_name: Avg Order Value
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- AOV
- name: revenue_per_customer
expr: MEASURE(total_revenue) / MEASURE(unique_customers)
display_name: Revenue per Customer
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
- name: open_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'O')
display_name: Open Order Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- backlog
- name: fulfilled_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'F')
display_name: Fulfilled Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
- name: t7d_customers
expr: COUNT(DISTINCT o_custkey)
window:
- order: order_date
semiadditive: last
range: trailing 7 day
display_name: 7-Day Rolling Customers
format:
type: number
decimal_places:
type: exact
places: 0

This example adds the following agent metadata:

  • display_name provides human-readable labels that appear in visualization tools instead of technical column names.
  • format defines how values display (currency, number, percentage) in dashboards.
  • synonyms help AI tools like Genie discover dimensions and measures through natural language queries.

For complete details on agent metadata options, see Agent metadata in metric views.

Step 7: Complete YAML definition

Here's the full metric view definition:

YAML
version: 1.1

source: SELECT * FROM samples.tpch.orders

joins:
- name: customer
source: samples.tpch.customer
'on': o_custkey = c_custkey
joins:
- name: nation
source: samples.tpch.nation
'on': c_nationkey = n_nationkey

filter: o_orderdate >= '1995-01-01'

comment: |-
Sales analytics metric view for order performance analysis.
Joins orders with customers and geography.
Owner: Analytics Team
Last updated: 2025-01-15

dimensions:
- name: order_date
expr: o_orderdate
display_name: Order Date
- name: order_month
expr: "DATE_TRUNC('MONTH', o_orderdate)"
display_name: Order Month
- name: order_year
expr: YEAR(o_orderdate)
display_name: Order Year
- name: order_status
expr: |-
CASE o_orderstatus
WHEN 'O' THEN 'Open'
WHEN 'P' THEN 'Processing'
WHEN 'F' THEN 'Fulfilled'
END
display_name: Order Status
synonyms:
- status
- fulfillment status
- name: order_priority
expr: "SPLIT(o_orderpriority, '-')[0]"
display_name: Priority
- name: customer_name
expr: customer.c_name
display_name: Customer Name
- name: market_segment
expr: customer.c_mktsegment
display_name: Market Segment
synonyms:
- segment
- industry
- name: customer_nation
expr: customer.nation.n_name
display_name: Country
synonyms:
- nation
- country

measures:
- name: order_count
expr: COUNT(DISTINCT o_orderkey)
display_name: Order Count
format:
type: number
decimal_places:
type: exact
places: 0
abbreviation: compact
- name: total_revenue
expr: SUM(o_totalprice)
display_name: Total Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- revenue
- sales
- name: unique_customers
expr: COUNT(DISTINCT o_custkey)
display_name: Unique Customers
format:
type: number
decimal_places:
type: exact
places: 0
abbreviation: compact
- name: avg_order_value
expr: MEASURE(total_revenue) / MEASURE(order_count)
display_name: Avg Order Value
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- AOV
- name: revenue_per_customer
expr: MEASURE(total_revenue) / MEASURE(unique_customers)
display_name: Revenue per Customer
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
- name: open_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'O')
display_name: Open Order Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- backlog
- name: fulfilled_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'F')
display_name: Fulfilled Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
- name: t7d_customers
expr: COUNT(DISTINCT o_custkey)
window:
- order: order_date
semiadditive: last
range: trailing 7 day
display_name: 7-Day Rolling Customers
format:
type: number
decimal_places:
type: exact
places: 0

Create the metric view using SQL

If you're building this definition outside Catalog Explorer, run the following SQL to create the metric view. Paste the complete YAML from above between the $$ delimiters.

SQL
CREATE OR REPLACE VIEW catalog.schema.tpch_sales_analytics
WITH METRICS
LANGUAGE YAML
AS $$
version: 1.1

source: SELECT * FROM samples.tpch.orders

joins:
- name: customer
source: samples.tpch.customer
'on': o_custkey = c_custkey
joins:
- name: nation
source: samples.tpch.nation
'on': c_nationkey = n_nationkey

filter: o_orderdate >= '1995-01-01'

comment: |-
Sales analytics metric view for order performance analysis.
Joins orders with customers and geography.
Owner: Analytics Team
Last updated: 2025-01-15

dimensions:
- name: order_date
expr: o_orderdate
display_name: Order Date
- name: order_month
expr: "DATE_TRUNC('MONTH', o_orderdate)"
display_name: Order Month
- name: order_year
expr: YEAR(o_orderdate)
display_name: Order Year
- name: order_status
expr: |-
CASE o_orderstatus
WHEN 'O' THEN 'Open'
WHEN 'P' THEN 'Processing'
WHEN 'F' THEN 'Fulfilled'
END
display_name: Order Status
synonyms:
- status
- fulfillment status
- name: order_priority
expr: "SPLIT(o_orderpriority, '-')[0]"
display_name: Priority
- name: customer_name
expr: customer.c_name
display_name: Customer Name
- name: market_segment
expr: customer.c_mktsegment
display_name: Market Segment
synonyms:
- segment
- industry
- name: customer_nation
expr: customer.nation.n_name
display_name: Country
synonyms:
- nation
- country

measures:
- name: order_count
expr: COUNT(DISTINCT o_orderkey)
display_name: Order Count
format:
type: number
decimal_places:
type: exact
places: 0
abbreviation: compact
- name: total_revenue
expr: SUM(o_totalprice)
display_name: Total Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- revenue
- sales
- name: unique_customers
expr: COUNT(DISTINCT o_custkey)
display_name: Unique Customers
format:
type: number
decimal_places:
type: exact
places: 0
abbreviation: compact
- name: avg_order_value
expr: MEASURE(total_revenue) / MEASURE(order_count)
display_name: Avg Order Value
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- AOV
- name: revenue_per_customer
expr: MEASURE(total_revenue) / MEASURE(unique_customers)
display_name: Revenue per Customer
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
- name: open_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'O')
display_name: Open Order Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
synonyms:
- backlog
- name: fulfilled_order_revenue
expr: SUM(o_totalprice) FILTER (WHERE o_orderstatus = 'F')
display_name: Fulfilled Revenue
format:
type: currency
currency_code: USD
decimal_places:
type: exact
places: 2
abbreviation: compact
- name: t7d_customers
expr: COUNT(DISTINCT o_custkey)
window:
- order: order_date
semiadditive: last
range: trailing 7 day
display_name: 7-Day Rolling Customers
format:
type: number
decimal_places:
type: exact
places: 0
$$;

For other ways to create a metric view, see Create and edit metric views.

Step 8: Query your metric view

Now you can query using business-friendly syntax:

SQL
-- Aggregates total revenue, order count, and average order value
-- by customer nation and market segment, ranked by highest revenue first.
SELECT
customer_nation,
market_segment,
MEASURE(total_revenue) AS total_revenue,
MEASURE(order_count) AS order_count,
MEASURE(avg_order_value) AS avg_order_value
FROM catalog.schema.tpch_sales_analytics
GROUP BY customer_nation, market_segment
ORDER BY total_revenue DESC;
SQL
-- Monthly trend with backlog analysis
SELECT
order_month,
order_status,
MEASURE(total_revenue) AS total_revenue,
MEASURE(open_order_revenue) AS open_order_revenue
FROM catalog.schema.tpch_sales_analytics
GROUP BY order_month, order_status
ORDER BY order_month;

What you learned

You built a metric view that demonstrates:

Feature

Example

Snowflake schema joins

Orders to customer to nation (nested many-to-one joins)

Time dimensions

Date, month, year granularity

Transformed dimensions

CASE statements, SPLIT functions

Simple measures

COUNT, SUM

Composability

avg_order_value and revenue_per_customer reference earlier-defined measures using MEASURE()

Filtered measures

FILTER (WHERE ...) for conditional aggregations

Window measures

Rolling 7-day customer count using trailing 7 day

Agent metadata

display_name, format, synonyms

Next steps