Use semantic metadata in metric views
This feature is in Public Preview.
Semantic metadata in metric views provides additional context and information that enhances data visualization and improves large language model (LLM) accuracy when working with metric views. Metadata includes display names, format specifications, and synonyms that help visualization tools, such as AI/BI dashboards, and natural language AI tools, such as Genie spaces, understand and work with your data more effectively.
Requires Databricks Runtime 17.2 or above. Metric view YAML definitions must use specification version 1.1 or above. See Version specification changelog for details.
What is semantic metadata?
Semantic metadata provides additional context and information for metric view dimensions and measures. It is defined in the YAML definition for the metric view. The following are the types of metadata you can include.
When you create or alter metric views with specification version 1.1, any single-line comments (denoted with #) in the YAML definition are removed when the definition is saved. See Upgrade your YAML to 1.1 for options and recommendations when upgrading existing YAML definitions.
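As a rough illustration of the note above, the sketch below contrasts a `#` comment with the `comment` field that appears in the complete example later on this page. Moving notes into `comment` is shown here only as one possible approach, and the wording of the comment text is a placeholder. The upgrade guide may recommend other options.

```yaml
version: 1.1
source: samples.tpch.orders
dimensions:
  - name: order_date
    expr: o_orderdate
    # This line is a YAML comment, so it is removed when the definition is saved.
    # The `comment` field below is part of the definition itself.
    comment: Date when the order was placed
```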
Display names
Display names provide human-readable labels that appear in visualization tools instead of technical column names. Display names are limited to 255 characters.
The following example shows display names defined on the order_date dimension and total_revenue measure.
version: 1.1
source: samples.tpch.orders
dimensions:
  - name: order_date
    expr: o_orderdate
    display_name: 'Order Date'
measures:
  - name: total_revenue
    expr: SUM(o_totalprice)
    display_name: 'Total Revenue'
Synonyms
Synonyms provide alternative names that help LLM tools, such as AI/BI Genie, match user input to dimensions and measures. You can define synonyms using either block style or flow style YAML. Each dimension or measure can have up to 10 synonyms. Each synonym is limited to 255 characters.
The following example shows synonyms defined on the order_date dimension and the total_revenue measure:
version: 1.1
source: samples.tpch.orders
dimensions:
  - name: order_date
    expr: o_orderdate
    # block style
    synonyms:
      - 'order time'
      - 'date of order'
measures:
  - name: total_revenue
    expr: SUM(o_totalprice)
    # flow style
    synonyms: ['revenue', 'total sales']
Format specifications
Format specifications define how values should be displayed in visualization tools. The following tables include supported format types and examples.
Numeric Formats
Format Type | Required Options | Optional Options |
---|---|---|
Number: Use plain number format for general numeric values with optional decimal place control and abbreviation options. | | |
Currency: Use currency format for monetary values with ISO-4217 currency codes. | | |
Percentage: Use percentage format for ratio values expressed as percentages. | | |
Numeric formatting examples
Number
format:
  type: number
  decimal_places:
    type: max
    places: 2
  hide_group_separator: false
  abbreviation: compact
Currency
format:
  type: currency
  currency_code: USD
  decimal_places:
    type: exact
    places: 2
  hide_group_separator: false
  abbreviation: compact
Percentage
format:
  type: percentage
  decimal_places:
    type: all
  hide_group_separator: true
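To show where a format block sits inside a full definition, here is a minimal sketch that attaches a percentage format to a measure. The measure name `high_value_order_share`, its expression, and its display name are hypothetical and chosen only for illustration. This page does not state whether the percentage format expects a ratio such as 0.25 or a value such as 25, so check the rendered result before relying on it.

```yaml
version: 1.1
source: samples.tpch.orders
measures:
  # Hypothetical measure: share of orders above an arbitrary price threshold.
  - name: high_value_order_share
    expr: SUM(CASE WHEN o_totalprice > 100000 THEN 1 ELSE 0 END) / COUNT(1)
    display_name: 'High-Value Order Share'
    format:
      type: percentage
      decimal_places:
        type: max      # show at most `places` decimal places
        places: 2
      hide_group_separator: true
```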
Date & Time Formats
The following table explains how to work with date and time formats.
Format Type | Required Options | Optional Options |
---|---|---|
Date: Use date format for date values with various display options. | | |
DateTime: Use datetime format for timestamp values combining date and time. | | |
When working with a date_time type, at least one of date_format or time_format must specify a value other than no_date or no_time.
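As a sketch of this rule, the following date_time format suppresses the date part while still specifying a time format, so one of the two fields carries a displayable value. It uses only option values named on this page. Setting date_format to no_date and time_format to no_time together would leave nothing to display, which is the combination the rule excludes.

```yaml
format:
  type: date_time
  date_format: no_date                     # suppress the date part
  time_format: locale_hour_minute_second   # a real time format, so the rule is satisfied
```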
Datetime formatting examples
Date
format:
  type: date
  date_format: year_month_day
  leading_zeros: true
DateTime
format:
  type: date_time
  date_format: year_month_day
  time_format: locale_hour_minute_second
  leading_zeros: false
Complete example
The following example shows a metric view definition that includes all semantic metadata types:
version: 1.1
source: samples.tpch.orders
comment: Comprehensive sales metrics with enhanced semantic metadata
dimensions:
  - name: order_date
    expr: o_orderdate
    comment: Date when the order was placed
    display_name: Order Date
    format:
      type: date
      date_format: year_month_day
      leading_zeros: true
    synonyms:
      - order time
      - date of order
  - name: customer_segment
    expr: |
      CASE
        WHEN o_totalprice > 100000 THEN 'Enterprise'
        WHEN o_totalprice > 10000 THEN 'Mid-market'
        ELSE 'SMB'
      END
    comment: Customer classification based on order value
    display_name: Customer Segment
    synonyms:
      - segment
      - customer tier
measures:
  - name: total_revenue
    expr: SUM(o_totalprice)
    comment: Total revenue from all orders
    display_name: Total Revenue
    format:
      type: currency
      currency_code: USD
      decimal_places:
        type: exact
        places: 2
      hide_group_separator: false
      abbreviation: compact
    synonyms:
      - revenue
      - total sales
      - sales amount
  - name: order_count
    expr: COUNT(1)
    comment: Total number of orders
    display_name: Order Count
    format:
      type: number
      decimal_places:
        type: all
      hide_group_separator: true
    synonyms:
      - count
      - number of orders
  - name: avg_order_value
    expr: SUM(o_totalprice) / COUNT(1)
    comment: Average revenue per order
    display_name: Average Order Value
    format:
      type: currency
      currency_code: USD
      decimal_places:
        type: exact
        places: 2
    synonyms:
      - aov
      - average revenue