Billable usage system table reference
This article provides an overview of the billable usage system table, including the schema and example queries. With system tables, your account’s billable usage data is centralized and routed to all regions, so you can view your account’s global usage from whichever region your workspace is in.
For information on using this table to monitor costs and sample queries, see Monitor costs using system tables.
Table path: This system table is located at system.billing.usage
.
Billable usage table schema
The billable usage system table uses the following schema:
Column name | Data type | Description | Example |
---|---|---|---|
| string | Unique ID for this usage record |
|
| string | ID of the account this report was generated for |
|
| string | ID of the workspace this usage was associated with |
|
| string | Name of the SKU |
|
| string | Cloud associated with this usage. Possible values are |
|
| timestamp | The start time relevant to this usage record. Timezone information is recorded at the end of the value with |
|
| timestamp | The end time relevant to this usage record. Timezone information is recorded at the end of the value with |
|
| date | Date of the usage record, this field can be used for faster aggregation by date |
|
| map | Custom tags associated with the usage record |
|
| string | Unit this usage is measured in |
|
| decimal | Number of units consumed for this record |
|
| struct | System-provided metadata about the usage, including IDs for compute resources and jobs (if applicable). See Usage Metadata. | See Usage metadata |
| struct | System-provided metadata about the identities involved in the usage. See Identity Metadata. | |
| string | Whether the record is original, a retraction, or a restatement. The value is |
|
| date | Date the record was ingested into the |
|
| string | The product that originated the usage. Some products can be billed as different SKUs. For possible values, see Product. |
|
| struct | Details about the specific product features used. See Product features. | See Product features |
| string | The type of usage attributed to the product or workload for billing purposes. Possible values are |
|
Usage metadata reference
The values in usage_metadata
are all strings that tell you about the workspace objects and resources involved in the usage record.
Only a subset of these values is populated in any given usage record, depending on the compute type and features used. The third column in the table shows which usage types cause each value to be populated.
Value | Description | Populated for (otherwise |
---|---|---|
| ID of the cluster associated with the usage record | Non-serverless compute usage, including notebooks, jobs, DLT, and legacy model serving |
| ID of the job associated with the usage record | Serverless jobs and jobs run on job compute (does not populate for jobs run on all-purpose compute) |
| ID of the SQL warehouse associated with the usage record | Workloads run on a SQL warehouse |
| ID of the instance pool associated with the usage record | Non-serverless compute usage from pools, including notebooks, jobs, DLT, and legacy model serving |
| The instance type of the compute resource | Non-serverless compute usage, including notebooks, jobs, DLT, and all SQL warehouses |
| ID of the job run associated with the usage record | Serverless jobs and jobs run on job compute (does not populate for jobs run on all-purpose compute) |
| ID of the notebook associated with the usage | Serverless notebooks |
| ID of the DLT pipeline associated with the usage record | DLT and features that use DLT pipelines, such as materialized views, online tables, vector search indexing, and Lakeflow Connect |
| The name of the model serving endpoint or vector search endpoint associated with the usage record | Model serving and Vector Search |
| ID of the model serving endpoint or vector search endpoint associated with the usage record | Model serving and Vector Search |
| ID of the DLT pipeline update associated with the usage record | DLT and features that use DLT pipelines, such as materialized views, online tables, vector search indexing, and Lakeflow Connect |
| ID of the DLT pipeline maintenance tasks associated with the usage record | DLT and features that use DLT pipelines, such as materialized views, online tables, vector search indexing, and Lakeflow Connect |
| ID of the metastore associated with the default storage | |
| Unique user-facing name of the Foundation Model Fine-tuning run associated with the usage record | Foundation Model Fine-tuning |
| User-given name of the job associated with the usage record | Jobs run on serverless compute |
| Workspace storage path of the notebook associated with the usage | Notebooks run on serverless compute |
| ID of the central clean room associated with the usage record | Clean Rooms |
| Region of the workspace associated with the usage. Only returns a value for serverless networking-related usage. | |
| Region of the resource being accessed. Only returns a value for serverless networking-related usage. | |
| ID of the app associated with the usage record | Databricks Apps |
| User-given name of the app associated with the usage record | Databricks Apps |
| This value is not populated in Databricks on AWS | Always |
| ID of the serverless budget policy attached to the workload | Serverless compute usage, including notebooks, jobs, DLT, and model serving endpoints |
Identity metadata reference
The identity_metadata
column provides more information about the identities involved in the usage.
- The
run_as
field logs who ran the workload. This values is only populated for certain workload types listed in the table below. - The
owned_by
field only applies to SQL warehouse usage and logs the user or service principal who owns the SQL warehouse responsible for the usage.
- The
identity_metadata.created_by
field applies to Databricks Apps and logs the email of the user who created the app.
run_as identities
The identity recorded in identity_metadata.run_as
depends on the product associated with the usage. Reference the following table for the identity_metadata.run_as
behavior:
Workload type | Identity of |
---|---|
Jobs compute | The user or service principal defined in the |
Serverless compute for jobs | The user or service principal defined in the |
Serverless compute for notebooks | The user who ran the notebook commands (specifically, the user who created the notebook session). For shared notebooks, this includes usage by other users sharing the same notebook session. |
DLT pipelines | The user whose permissions are used to run the DLT pipeline. This can be changed by transferring the pipeline’s ownership. |
Foundation Model Fine-tuning | The user or service principal that initiated the fine-tuning training run. |
Predictive optimization | The Databricks-owned service principal that runs predictive optimization operations. |
Lakehouse monitoring | The user who created the monitor. |
In workspaces enabled for the FedRamp compliance standard, all non-null values in the identity_metadata
column will be replaced with __REDACTED__
.
Record type reference
The billing.usage
table supports corrections. Corrections occur when any field of the usage record is incorrect and must be fixed.
When a correction happens, Databricks adds two new records to the table. A retraction record negates the original incorrect record, then a restatement record includes the corrected information. Correction records are identified using the record_type
field:
RETRACTION
: Used to negate the original incorrect usage. All fields are identical to theORIGINAL
record exceptusage_quantity
, which is a negative value that cancels out the original usage quantity. For example, if the original record’s usage quantity was259.4356
, then the retraction record would have a usage quantity of-259.4356
.RESTATEMENT
: The record that includes the correct fields and usage quantity.
For example, the following query returns the correct hourly usage quantity related to a job_id
, even if corrections have been made. By aggregating the usage quantity, the retraction record negates the original record and only the restatement’s values are returned.
SELECT
usage_metadata.job_id, usage_start_time, usage_end_time,
SUM(usage_quantity) as usage_quantity
FROM system.billing.usage
GROUP BY ALL
HAVING usage_quantity != 0
For corrections where the original usage record should not have been written, a correction may only add a retraction record and no restatement record.
Billing origin product reference
Some Databricks products are billed under the same shared SKU. For example, Lakehouse Monitoring, predictive optimization, and serverless workflows are all billed under the same serverless jobs SKU.
To help you differentiate usage, the billing_origin_product
and product_features
columns provide more insight into the specific product and features associated with the usage.
The billing_origin_product
column shows the Databricks product associated with the usage record. The values include:
JOBS
DLT
SQL
ALL_PURPOSE
MODEL_SERVING
INTERACTIVE
DEFAULT_STORAGE
VECTOR_SEARCH
LAKEHOUSE_MONITORING
PREDICTIVE_OPTIMIZATION
ONLINE_TABLES
FOUNDATION_MODEL_TRAINING
AGENT_EVALUATION
FINE_GRAINED_ACCESS_CONTROL
NETWORKING
: Costs associated with connecting serverless compute to your resourcesAPPS
: Costs associated with building and running Databricks Apps
Product features reference
The product_features
column is an object containing information about the specific product features used and includes the following key/value pairs:
jobs_tier
: values includeLIGHT
,CLASSIC
, ornull
sql_tier
: values includeCLASSIC
,PRO
, ornull
dlt_tier
: values includeCORE
,PRO
,ADVANCED
, ornull
is_serverless
: values includetrue
orfalse
, ornull
is_photon
: values includetrue
orfalse
, ornull
serving_type
: values includeMODEL
,GPU_MODEL
,FOUNDATION_MODEL
,FEATURE
, ornull
networking.connectivity_type
: values includePUBLIC_IP
andPRIVATE_IP