View and query the billing logs
This article explains how admins can monitor usage in their SAP Databricks accounts using the billing system tables.
What are system tables?
System tables are an analytical store of your account's operational data found in the `system` catalog. System tables can be used for historical observability across your account. They are read-only and cannot be modified.
To grant access to the `billing` schema, a user who is both a metastore admin and an account admin must grant `USE` and `SELECT` permissions on the system schemas.
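For example, a minimal sketch of these grants, assuming a group named `billing-readers` (substitute your own user or group name):

```sql
-- Let the group browse the billing schema and query the billing tables
GRANT USE SCHEMA ON SCHEMA system.billing TO `billing-readers`;
GRANT SELECT ON TABLE system.billing.usage TO `billing-readers`;
GRANT SELECT ON TABLE system.billing.list_prices TO `billing-readers`;
```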
Available billing tables
There are two tables you can use to monitor costs:
- `system.billing.usage`: Gives you access to account-wide billable usage data.
- `system.billing.list_prices`: Gives you access to a historical log of SKU pricing. A record gets added each time there is a change to a SKU price.
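As a quick way to confirm you have access, the following illustrative query previews the most recent usage records:

```sql
-- Preview the latest billable usage records
SELECT usage_date, sku_name, usage_unit, usage_quantity
FROM system.billing.usage
ORDER BY usage_date DESC
LIMIT 10
```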
Billable usage table schema
Table path: This system table is located at `system.billing.usage`.
The billable usage system table uses the following schema:
| Column name | Data type | Description |
|---|---|---|
| `record_id` | string | Unique ID for this usage record |
| `account_id` | string | ID of the account this report was generated for |
| `workspace_id` | string | ID of the workspace this usage was associated with |
| `sku_name` | string | Name of the SKU |
| `cloud` | string | Cloud associated with this usage |
| `usage_start_time` | timestamp | The start time relevant to this usage record. Timezone information is recorded at the end of the value, with `+00:00` representing UTC |
| `usage_end_time` | timestamp | The end time relevant to this usage record. Timezone information is recorded at the end of the value, with `+00:00` representing UTC |
| `usage_date` | date | Date of the usage record. This field can be used for faster aggregation by date |
| `custom_tags` | map | Custom tags propagated from the serverless budget policy associated with the usage |
| `usage_unit` | string | Unit this usage is measured in |
| `usage_quantity` | decimal | Number of units consumed for this record |
| `usage_metadata` | struct | System-provided metadata about the usage, including IDs for compute resources and jobs (if applicable). See Usage metadata reference |
| `identity_metadata` | struct | System-provided metadata about the identities involved in the usage. See Identity metadata reference |
| `record_type` | string | Whether the record is original, a retraction, or a restatement. The value is `ORIGINAL`, `RETRACTION`, or `RESTATEMENT` |
| `ingestion_date` | date | Date the record was ingested into the `usage` table |
| `billing_origin_product` | string | The product that originated the usage. Some products can be billed as different SKUs. For possible values, see Billing origin product reference |
| `product_features` | struct | Details about the specific product features used |
| `usage_type` | string | The type of usage attributed to the product or workload for billing purposes |
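For example, a simple aggregation over this schema sums DBUs per month and SKU (illustrative; adjust the lookback window to your needs):

```sql
-- Total DBUs per month and SKU over the last 90 days
SELECT
  date_trunc('MONTH', usage_date) AS usage_month,
  sku_name,
  SUM(usage_quantity) AS total_dbu
FROM system.billing.usage
WHERE usage_unit = 'DBU'
  AND usage_date >= DATEADD(day, -90, current_date)
GROUP BY 1, 2
ORDER BY usage_month DESC, total_dbu DESC
```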
Pricing table schema
Table path: This system table is located at `system.billing.list_prices`.
The pricing system table uses the following schema:
| Column name | Data type | Description |
|---|---|---|
| `price_start_time` | timestamp | The time this price became effective in UTC |
| `price_end_time` | timestamp | The time this price stopped being effective in UTC |
| `account_id` | string | ID of the account this report was generated for |
| `sku_name` | string | Name of the SKU |
| `cloud` | string | Name of the cloud this price is applicable to |
| `currency_code` | string | The currency this price is expressed in |
| `usage_unit` | string | The unit of measurement that is monetized |
| `pricing` | struct | A structured data field that includes pricing info at the published list price rate. The key `default` always returns a single price that can be used for simple estimates |
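Because a price that is still in effect has no `price_end_time`, you can list the current list price for each SKU with a query like the following sketch (which assumes the `default` key of the `pricing` struct):

```sql
-- Current list price per SKU (rows still in effect have no end time)
SELECT
  sku_name,
  cloud,
  currency_code,
  usage_unit,
  pricing.default AS current_list_price
FROM system.billing.list_prices
WHERE price_end_time IS NULL
ORDER BY sku_name
```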
How to read the usage table
Every billing record in the `usage` table includes columns that attribute the usage amount to the specific resources, identities, and products involved. To help attribute usage to specific workloads, features, and identities, use the following columns:
- The `usage_metadata` column includes a struct with information about the resources or objects involved in the usage.
- The `identity_metadata` column includes information about the user or service principal that incurred the usage.
- The `custom_tags` column includes tags added by serverless budget policies. For more information, see Attribute usage with serverless budget policies.
- The `billing_origin_product` column gives you information about the exact product used.
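For example, a minimal sketch that combines these columns to break down the last 30 days of DBU consumption by product and by the identity that ran the workload:

```sql
-- DBUs per product and run_as identity over the last 30 days
SELECT
  billing_origin_product,
  identity_metadata.run_as,
  SUM(usage_quantity) AS total_dbu
FROM system.billing.usage
WHERE usage_unit = 'DBU'
  AND usage_date >= DATEADD(day, -30, current_date)
GROUP BY 1, 2
ORDER BY total_dbu DESC
```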
Usage metadata reference
The values in `usage_metadata` are all strings that tell you about the workspace objects involved in the usage record.
Only a subset of these values is populated in any given usage record, depending on the compute type and features used. The following table outlines the values you can expect to populate in your SAP Databricks account.
| Value | Description |
|---|---|
| `job_id` | ID of the job associated with the usage record |
| `warehouse_id` | ID of the SQL warehouse associated with the usage record |
| `job_run_id` | ID of the job run associated with the usage record |
| `notebook_id` | ID of the notebook associated with the usage |
| `endpoint_name` | The name of the model serving endpoint or vector search endpoint associated with the usage record |
| `endpoint_id` | ID of the model serving endpoint or vector search endpoint associated with the usage record |
| `run_name` | Unique user-facing name of the Foundation Model Fine-tuning run associated with the usage record |
| `job_name` | User-given name of the job associated with the usage record |
| `notebook_path` | Workspace storage path of the notebook associated with the usage |
| `source_region` | Region of the workspace associated with the usage. Only returns a value for serverless networking-related usage |
| `destination_region` | Region of the resource being accessed. Only returns a value for serverless networking-related usage |
| `budget_policy_id` | ID of the serverless budget policy attached to the workload |
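As an illustration of how these fields appear in queries, the following sketch attributes recent job usage down to individual job runs:

```sql
-- DBUs per job and job run over the last 7 days
SELECT
  usage_metadata.job_id,
  usage_metadata.job_name,
  usage_metadata.job_run_id,
  SUM(usage_quantity) AS total_dbu
FROM system.billing.usage
WHERE usage_metadata.job_id IS NOT NULL
  AND usage_unit = 'DBU'
  AND usage_date >= DATEADD(day, -7, current_date)
GROUP BY 1, 2, 3
ORDER BY total_dbu DESC
```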
Identity metadata reference
The `identity_metadata` column provides more information about the identities involved in the usage.
- The `run_as` field logs who ran the workload. This value is only populated for certain workload types, listed in the table below.
- The `owned_by` field only applies to SQL warehouse usage and logs the user or service principal who owns the SQL warehouse responsible for the usage.
run_as identities
The identity recorded in `identity_metadata.run_as` depends on the product associated with the usage. Reference the following table for the `identity_metadata.run_as` behavior:

| Workload type | Identity of `run_as` |
|---|---|
| Serverless compute for jobs | The user or service principal defined in the job's `run_as` setting |
| Serverless compute for notebooks | The user who ran the notebook commands (specifically, the user who created the notebook session). For shared notebooks, this includes usage by other users sharing the same notebook session |
| Foundation Model Fine-tuning | The user or service principal that initiated the fine-tuning training run |
| Lakehouse monitoring | The user who created the monitor |
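For example, an illustrative query that attributes SQL warehouse usage to the warehouse owner through `identity_metadata.owned_by`:

```sql
-- DBUs per SQL warehouse and owner over the last 30 days
SELECT
  identity_metadata.owned_by,
  usage_metadata.warehouse_id,
  SUM(usage_quantity) AS total_dbu
FROM system.billing.usage
WHERE usage_metadata.warehouse_id IS NOT NULL
  AND usage_unit = 'DBU'
  AND usage_date >= DATEADD(day, -30, current_date)
GROUP BY 1, 2
ORDER BY total_dbu DESC
```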
Billing origin product reference
The `billing_origin_product` column shows the Databricks product associated with the usage record. The values include:
- `JOBS`
- `SQL`
- `MODEL_SERVING`
- `INTERACTIVE`
- `DEFAULT_STORAGE`
- `VECTOR_SEARCH`
- `LAKEHOUSE_MONITORING`
- `FOUNDATION_MODEL_TRAINING`
- `AGENT_EVALUATION`
- `NETWORKING`: Costs associated with connecting serverless compute to your resources
Sample queries
Use the following queries to gain insights into usage in your account:
- Join the list_prices table with the usage table
- Identify expensive notebooks
- Identify expensive jobs
- Report on DBUs consumed by a particular user
- Report on DBUs consumed by workloads that share a custom tag
Join the list_prices table with the usage table
The `list_prices` table includes list prices over time for each available SKU. You can join the `usage` table to view the listing cost of certain usage.
For example, the following query returns the total cost attributed to a particular user during a month. Be sure to insert a valid identity in the bracketed section.
SELECT
SUM(usage.usage_quantity * list_prices.pricing.effective_list.default)
as `Total Dollar Cost`
FROM system.billing.usage
JOIN system.billing.list_prices ON list_prices.sku_name = usage.sku_name
WHERE identity_metadata.run_as = {{'sample identity'}}
AND usage.usage_end_time >= list_prices.price_start_time
AND (list_prices.price_end_time IS NULL OR usage.usage_end_time < list_prices.price_end_time)
AND usage.usage_date BETWEEN "2025-04-01" AND "2025-04-15"
Identify expensive notebooks
This query returns a list of notebooks and how many DBUs each notebook consumed, in descending order by DBU consumption:
SELECT
usage_metadata.notebook_id,
usage_metadata.notebook_path,
SUM(usage_quantity) as total_dbu
FROM
system.billing.usage
WHERE
usage_metadata.notebook_id is not null
and billing_origin_product = 'INTERACTIVE'
and product_features.is_serverless
and usage_unit = 'DBU'
and usage_date >= DATEADD(day, -30, current_date)
GROUP BY
1,2
ORDER BY
total_dbu DESC
Identify expensive jobs
This query returns a list of jobs and how many DBUs each job consumed, in descending order by DBU consumption:
SELECT
usage_metadata.job_id,
usage_metadata.job_name,
SUM(usage_quantity) as total_dbu
FROM
system.billing.usage
WHERE
usage_metadata.job_id is not null
and usage_unit = 'DBU'
and usage_date >= DATEADD(day, -30, current_date)
and sku_name like '%JOBS_SERVERLESS_COMPUTE%'
GROUP BY
1,2
ORDER BY
total_dbu DESC
Report on DBUs consumed by a particular user
This query returns a list of notebooks and jobs run by a particular user or service principal, and the number of DBUs consumed by each workload. Be sure to insert a valid user email address using the parameters.
SELECT
usage_metadata.job_id,
usage_metadata.job_name,
usage_metadata.notebook_id,
usage_metadata.notebook_path,
SUM(usage_quantity) as total_dbu
FROM
system.billing.usage
WHERE
identity_metadata.run_as = format_string('%s@%s.%s', :user_name, :domain_name, :top_level_domain)
and usage_date >= DATEADD(day, -30, current_date)
GROUP BY
1,2,3,4
ORDER BY
total_dbu DESC
Report on DBUs consumed by workloads that share a custom tag
This query returns a list of jobs that share the same custom tag, and the number of DBUs consumed by each workload:
SELECT
usage_metadata.job_id,
usage_metadata.job_name,
usage_metadata.notebook_id,
usage_metadata.notebook_path,
SUM(usage_quantity) as total_dbu
FROM
system.billing.usage
WHERE
custom_tags.<key> = '<value>'
and billing_origin_product in ('JOBS','INTERACTIVE')
and product_features.is_serverless
and usage_unit = 'DBU'
and usage_date >= DATEADD(day, -30, current_date)
GROUP BY
1,2,3,4
ORDER BY
total_dbu DESC