Use semantic metadata in metric views
This feature is in Public Preview.
This page explains how to use semantic metadata in metric views to enhance data visualization and improve large language model (LLM) accuracy.
Requires Databricks Runtime 17.2 or above. Metric view YAML definitions must use specification version 1.1 or above. See Version specification changelog for details.
What is semantic metadata?
Semantic metadata includes display names, format specifications, and synonyms that provide additional context. This metadata helps visualization tools, such as AI/BI dashboards, and natural language tools, such as Genie spaces, interpret and work with your data more effectively. Semantic metadata is defined in the YAML definition of a metric view.
When you create or alter a metric view with specification version 1.1, any single-line comments (denoted with #) in the YAML definition are removed when the definition is saved. See Upgrade your YAML to 1.1 for options and recommendations when upgrading existing YAML definitions.
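Because # comments do not survive a save, a note that matters can instead be carried in a field that is stored with the definition. The following sketch moves a note into the comment field, which the complete example on this page uses for descriptions. Whether this suits a given note is an assumption, and the upgrade guide referenced above may recommend other options.

```yaml
version: 1.1
source: samples.tpch.orders
dimensions:
  - name: order_date
    expr: o_orderdate
    # Date when the order was placed (this line is removed on save)
    comment: Date when the order was placed
```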
Display names
Display names provide human-readable labels that appear in visualization tools instead of technical column names. Display names are limited to 255 characters.
The following example shows display names defined on the order_date dimension and total_revenue measure.
version: 1.1
source: samples.tpch.orders
dimensions:
  - name: order_date
    expr: o_orderdate
    display_name: 'Order Date'
measures:
  - name: total_revenue
    expr: SUM(o_totalprice)
    display_name: 'Total Revenue'
Synonyms
Synonyms provide alternative names that help LLM tools, such as AI/BI Genie, match user input to dimensions and measures. You can define synonyms using either block style or flow style YAML. Each dimension or measure can have up to 10 synonyms, and each synonym is limited to 255 characters.
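The following example shows synonyms defined in block style on the order_date dimension and in flow style on the total_revenue measure. The # comments only label the two styles and are removed when the definition is saved.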
version: 1.1
source: samples.tpch.orders
dimensions:
  - name: order_date
    expr: o_orderdate
    # block style
    synonyms:
      - 'order time'
      - 'date of order'
measures:
  - name: total_revenue
    expr: SUM(o_totalprice)
    # flow style
    synonyms: ['revenue', 'total sales']
Format specifications
Format specifications define how values should be displayed in visualization tools. The following tables include supported format types and examples.
Numeric formats
| Format Type | Required Options | Optional Options |
|---|---|---|
| Number: Use plain number format for general numeric values with optional decimal place control and abbreviation options. | | |
| Currency: Use currency format for monetary values with ISO-4217 currency codes. | | |
| Percentage: Use percentage format for ratio values expressed as percentages. | | |
Numeric formatting examples
The following example shows a number format:
format:
  type: number
  decimal_places:
    type: max
    places: 2
  hide_group_separator: false
  abbreviation: compact
The following example shows a currency format:
format:
  type: currency
  currency_code: USD
  decimal_places:
    type: exact
    places: 2
  hide_group_separator: false
  abbreviation: compact
The following example shows a percentage format:
format:
  type: percentage
  decimal_places:
    type: all
  hide_group_separator: true
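The format blocks above are fragments: in a metric view definition, each one is nested under the dimension or measure it applies to. The following sketch attaches the percentage format to a measure. The measure name large_order_rate, its expression, and the 100000 threshold are hypothetical and chosen only for illustration. The sketch also assumes that the measure returns a ratio that the visualization tool renders as a percentage.

```yaml
version: 1.1
source: samples.tpch.orders
dimensions:
  - name: order_date
    expr: o_orderdate
measures:
  # Hypothetical measure: share of orders with a total price above 100000.
  - name: large_order_rate
    expr: COUNT(1) FILTER (WHERE o_totalprice > 100000) / COUNT(1)
    display_name: Large Order Rate
    # Same percentage format as the fragment above, nested under the measure.
    format:
      type: percentage
      decimal_places:
        type: all
      hide_group_separator: true
```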
Date and time formats
The following table lists the supported date and time format types.
| Format Type | Required Options | Optional Options |
|---|---|---|
| Date: Use date format for date values with various display options. | | |
| DateTime: Use datetime format for timestamp values combining date and time. | | |
When working with a date_time type, at least one of date_format or time_format must specify a value other than no_date or no_time.
Datetime formatting examples
The following example shows a date format:
format:
  type: date
  date_format: year_month_day
  leading_zeros: true
The following example shows a date_time format:
format:
  type: date_time
  date_format: year_month_day
  time_format: locale_hour_minute_second
  leading_zeros: false
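As an illustration of the rule that at least one of date_format or time_format must be something other than no_date or no_time, the following sketch shows a time-only format. It assumes that no_date is the date_format value that hides the date portion, which the rule implies but this page does not show in an example.

```yaml
format:
  type: date_time
  # Assumed: no_date suppresses the date portion of the timestamp.
  date_format: no_date
  # time_format still carries a displayed value, so the rule is satisfied.
  time_format: locale_hour_minute_second
  leading_zeros: true
```

By the same rule, combining date_format: no_date with time_format: no_time would leave nothing to display and would violate the requirement.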
Complete example
The following example shows a metric view definition that includes all semantic metadata types:
version: 1.1
source: samples.tpch.orders
comment: Comprehensive sales metrics with enhanced semantic metadata
dimensions:
  - name: order_date
    expr: o_orderdate
    comment: Date when the order was placed
    display_name: Order Date
    format:
      type: date
      date_format: year_month_day
      leading_zeros: true
    synonyms:
      - order time
      - date of order
  - name: customer_segment
    expr: |
      CASE
        WHEN o_totalprice > 100000 THEN 'Enterprise'
        WHEN o_totalprice > 10000 THEN 'Mid-market'
        ELSE 'SMB'
      END
    comment: Customer classification based on order value
    display_name: Customer Segment
    synonyms:
      - segment
      - customer tier
measures:
  - name: total_revenue
    expr: SUM(o_totalprice)
    comment: Total revenue from all orders
    display_name: Total Revenue
    format:
      type: currency
      currency_code: USD
      decimal_places:
        type: exact
        places: 2
      hide_group_separator: false
      abbreviation: compact
    synonyms:
      - revenue
      - total sales
      - sales amount
  - name: order_count
    expr: COUNT(1)
    comment: Total number of orders
    display_name: Order Count
    format:
      type: number
      decimal_places:
        type: all
      hide_group_separator: true
    synonyms:
      - count
      - number of orders
  - name: avg_order_value
    expr: SUM(o_totalprice) / COUNT(1)
    comment: Average revenue per order
    display_name: Average Order Value
    format:
      type: currency
      currency_code: USD
      decimal_places:
        type: exact
        places: 2
    synonyms:
      - aov
      - average revenue