Monitor Unity AI Gateway cost
This feature is in Beta.
Observe and analyze cost for all Unity AI Gateway traffic by endpoint, target model, requesting principal, and tags.
Cost observability is based on Databricks billing records. For request-level usage analytics such as token counts, latency, requester details, and request tags, see Monitor usage for Unity AI Gateway endpoints.
Requirements
- Unity AI Gateway enabled for your account.
- A Databricks workspace in a Unity AI Gateway supported region.
- The billable usage system table enabled for your account. See Enable system tables.
Attribution
Unity AI Gateway provides cost attribution through the billable usage system table (system.billing.usage).
Unity AI Gateway enriches MODEL_SERVING billing records in system.billing.usage with endpoint-specific metadata so that Databricks cost can be attributed to the associated endpoints, target models, principals, and endpoint tags. For the complete schema and field definitions, see the Billing usage system table reference.
The billable usage system table includes cost attribution for Databricks-hosted models. For external model cost analysis in the dashboard, see External model cost.
For requests served through a Unity AI Gateway endpoint, Databricks populates the following fields on MODEL_SERVING records in system.billing.usage:
| Field | Description |
|---|---|
| usage_metadata.ai_gateway_endpoint_name | The name of the Unity AI Gateway endpoint that received the request. |
| | The ID of the Unity AI Gateway endpoint. |
| usage_metadata.ai_gateway_destination_model | The destination model that handled the request. |
| usage_metadata.ai_gateway_destination_id | The ID of the target that handled the request. |
| identity_metadata.run_by | The user or Databricks service principal that issued the request. |
| custom_tags | Endpoint tags configured on the Unity AI Gateway endpoint, such as team, project, or cost_center. |
These fields are populated for both real-time and batch inference requests routed through Unity AI Gateway endpoints.
Observability
The built-in usage dashboard includes a Cost Analysis page for monitoring cost and analyzing cost breakdowns over time. You can analyze cost across multiple dimensions, including:
- Endpoint
- Target model
- Requesting user or service principal
- Endpoint tags
- Request tags
To open the dashboard, click View Dashboard from the AI Gateway page. For details on importing and updating the dashboard, see Built-in usage dashboard.


Cost observability is available in dashboard version 0.4 and above. Account admins must update the dashboard to receive the latest template changes. See Built-in usage dashboard.
Tag-based analysis
The Cost Analysis page includes tag-based views and filters so you can analyze cost using endpoint tags and request tags.
Endpoint tags are configured on the Unity AI Gateway endpoint and apply to all requests sent to that endpoint. Request tags are attached to individual requests and enable more granular attribution within the same endpoint, such as by project, feature, environment, or end user.
Tag filters accept a semicolon-separated list in the format <entry1>;<entry2>;<entry3>, where each entry is specified as either:
- <key> to match all values for a tag key. For example, team matches all requests with the team tag.
- <key>=<value> to match a specific tag key-value pair. For example, team=ml-platform;env=prod matches requests tagged with team=ml-platform and env=prod.
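The filter format can be illustrated with a short Python sketch. This is not the dashboard's implementation: the function names parse_tag_filter and matches are hypothetical, and the sketch assumes that multiple entries combine with AND semantics, as the team=ml-platform;env=prod example suggests.

```python
from typing import Dict, List, Optional, Tuple

# Each parsed entry is (key, value); value is None for a bare <key> entry.
TagEntry = Tuple[str, Optional[str]]


def parse_tag_filter(filter_text: str) -> List[TagEntry]:
    """Split '<entry1>;<entry2>' into (key, value) pairs (hypothetical helper)."""
    entries: List[TagEntry] = []
    for raw in filter_text.split(";"):
        entry = raw.strip()
        if not entry:
            continue  # tolerate trailing or doubled semicolons
        key, sep, value = entry.partition("=")
        entries.append((key.strip(), value.strip() if sep else None))
    return entries


def matches(tags: Dict[str, str], entries: List[TagEntry]) -> bool:
    """Return True when the tags satisfy every entry (assumed AND semantics)."""
    for key, value in entries:
        if key not in tags:
            return False  # the tag key must be present
        if value is not None and tags[key] != value:
            return False  # a <key>=<value> entry must match exactly
    return True


entries = parse_tag_filter("team=ml-platform;env=prod")
print(matches({"team": "ml-platform", "env": "prod"}, entries))  # True
print(matches({"team": "ml-platform", "env": "dev"}, entries))   # False
```

A bare key such as team parses to ("team", None), so it matches any request that carries the tag regardless of its value.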
For information about configuring and querying request tags, see Tag requests and endpoints for usage tracking.
External model cost
The usage dashboard can be configured to include cost estimates for external models by specifying a model pricing table in the Pricing Table Override setting. The pricing table is user-managed and must be provided as input to the dashboard.

The pricing table must include the following fields:
| Field | Type | Description |
|---|---|---|
| | STRING | The model name used for cost attribution in the dashboard. |
| | DOUBLE | The price for input tokens. |
| | DOUBLE | The price for output tokens. |
| | DOUBLE | The price for cache-read input tokens, when supported. |
| | DOUBLE | The price for cache-write input tokens, when supported. |
Cost estimates for external models are for informational purposes only. These figures are calculated based on list or override prices and might not reflect your final provider invoice. Databricks is not liable for discrepancies in third-party billing.
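As a rough illustration of how per-token prices could turn into a cost estimate, the following Python sketch multiplies token counts by the matching price and sums the results. It makes several assumptions that this page does not confirm: the field names are hypothetical, prices are taken to be quoted per one million tokens (the tokens_per_unit parameter), and the four token categories are treated as disjoint. The dashboard's actual formula may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    # Hypothetical field names; the real pricing table columns may differ.
    input_price: float
    output_price: float
    cache_read_price: float = 0.0   # zero when the model has no cache pricing
    cache_write_price: float = 0.0


@dataclass(frozen=True)
class TokenUsage:
    # Token counts are assumed to be disjoint categories.
    input_tokens: int = 0
    output_tokens: int = 0
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0


def estimate_cost(usage: TokenUsage, price: ModelPrice,
                  tokens_per_unit: int = 1_000_000) -> float:
    """Estimate cost, assuming prices are quoted per tokens_per_unit tokens."""
    total = (
        usage.input_tokens * price.input_price
        + usage.output_tokens * price.output_price
        + usage.cache_read_tokens * price.cache_read_price
        + usage.cache_write_tokens * price.cache_write_price
    )
    return total / tokens_per_unit


# Example: 1M input tokens at 2.0 and 500K output tokens at 8.0 per million.
price = ModelPrice(input_price=2.0, output_price=8.0)
usage = TokenUsage(input_tokens=1_000_000, output_tokens=500_000)
print(estimate_cost(usage, price))  # 6.0
```

If your provider quotes prices per token or per thousand tokens, adjust tokens_per_unit accordingly so the estimate uses the same unit as the pricing table.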
Analyzing cost
The following queries analyze cost for Databricks-hosted models in system.billing.usage. Cost can be broken down by endpoint, target model, principal, and endpoint tag.
By endpoint
SELECT
  usage_metadata.ai_gateway_endpoint_name AS endpoint_name,
  SUM(usage_quantity) AS dbus
FROM system.billing.usage
WHERE billing_origin_product = 'MODEL_SERVING'
  AND usage_metadata.ai_gateway_endpoint_name IS NOT NULL
  AND usage_unit = 'DBU'
  AND usage_date >= current_date() - INTERVAL 30 DAYS
GROUP BY endpoint_name
ORDER BY dbus DESC;
By destination model
SELECT
  usage_metadata.ai_gateway_destination_model AS destination_model,
  SUM(usage_quantity) AS dbus
FROM system.billing.usage
WHERE billing_origin_product = 'MODEL_SERVING'
  AND usage_metadata.ai_gateway_endpoint_name IS NOT NULL
  AND usage_unit = 'DBU'
  AND usage_date >= current_date() - INTERVAL 30 DAYS
GROUP BY destination_model
ORDER BY dbus DESC;
By user or Databricks service principal
SELECT
  identity_metadata.run_by AS run_by,
  SUM(usage_quantity) AS dbus
FROM system.billing.usage
WHERE billing_origin_product = 'MODEL_SERVING'
  AND usage_metadata.ai_gateway_endpoint_name IS NOT NULL
  AND identity_metadata.run_by IS NOT NULL
  AND usage_unit = 'DBU'
  AND usage_date >= current_date() - INTERVAL 30 DAYS
GROUP BY run_by
ORDER BY dbus DESC;
By endpoint tag
Endpoint tags propagate to the billing records in custom_tags, which makes it possible to allocate cost by dimensions such as team, environment, project, or cost center.
SELECT
  custom_tags['team'] AS team,
  SUM(usage_quantity) AS dbus
FROM system.billing.usage
WHERE billing_origin_product = 'MODEL_SERVING'
  AND usage_metadata.ai_gateway_endpoint_name IS NOT NULL
  AND custom_tags['team'] IS NOT NULL
  AND usage_unit = 'DBU'
  AND usage_date >= current_date() - INTERVAL 30 DAYS
GROUP BY team
ORDER BY dbus DESC;
To add tags such as team, project, or cost_center to a Unity AI Gateway endpoint, see Configure Unity AI Gateway endpoints.
Limitations
- Spend attribution applies to MODEL_SERVING records in system.billing.usage. Requests routed to external models that are billed directly by the external provider do not appear in system.billing.usage.
- For Unity AI Gateway endpoints with multiple destinations, such as traffic splitting or fallbacks, ai_gateway_destination_model and ai_gateway_destination_id identify the destination that ultimately served the request.