Monitor usage using cluster and pool tags

To monitor cost and accurately attribute Databricks usage to your organization’s business units and teams (for chargebacks, for example), you can tag clusters and pools. These tags propagate both to detailed DBU usage reports and to AWS EC2 instances and AWS EBS volumes for cost analysis.

Tagged objects and resources

You can add custom tags for the following objects managed by Databricks:

| Object | Tagging interface (UI) | Tagging interface (API) |
|---|---|---|
| Pool | Pool UI in the Databricks workspace | Instance Pool API |
| Cluster | Cluster UI in the Databricks workspace | Clusters API |

Warning

Do not assign a custom tag with the key Name to a cluster. Every cluster has a tag Name whose value is set by Databricks. If you change the value associated with the key Name, the cluster can no longer be tracked by Databricks. As a consequence, the cluster might not be terminated after becoming idle and will continue to incur usage costs.
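As an illustration of the warning above, the following minimal sketch builds a cluster specification that carries custom tags and refuses the reserved key Name before anything is sent to the Clusters API. The helper build_cluster_spec and the example tag keys (CostCenter, Team) are hypothetical. The custom_tags field name is assumed to match the Clusters API request body, and a real create request needs additional fields (Spark version, node type, and so on) that are omitted here.

```python
import json

# Tag keys that Databricks sets itself and that must not be overridden.
RESERVED_TAG_KEYS = {"Name"}


def build_cluster_spec(cluster_name, custom_tags):
    """Return a partial cluster spec with custom tags (hypothetical helper).

    Raises ValueError if a reserved tag key such as "Name" is present.
    """
    reserved = RESERVED_TAG_KEYS & set(custom_tags)
    if reserved:
        raise ValueError(f"Reserved tag key(s) not allowed: {sorted(reserved)}")
    return {
        "cluster_name": cluster_name,
        # Assumed field name for custom tags in the request body.
        "custom_tags": dict(custom_tags),
    }


spec = build_cluster_spec("nightly-etl", {"CostCenter": "1234", "Team": "data-eng"})
print(json.dumps(spec, indent=2))
```

Rejecting the key on the client side is only a convenience: it catches the mistake before the cluster is created rather than after it has stopped being tracked.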

Databricks adds the following default tags to all pools and clusters:

| Pool tag key name | Value |
|---|---|
| Vendor | Constant “Databricks” |
| DatabricksInstancePoolCreatorId | Databricks internal identifier of the user who created the pool |
| DatabricksInstancePoolId | Databricks internal identifier of the pool |

| Cluster tag key name | Value |
|---|---|
| Vendor | Constant “Databricks” |
| ClusterId | Databricks internal identifier of the cluster |
| ClusterName | Name of the cluster |
| Creator | Username (email address) of the user who created the cluster |

On job clusters, Databricks also applies the following default tags:

| Cluster tag key name | Value |
|---|---|
| RunName | Job name |
| JobId | Job ID |

On resources used by Databricks SQL, Databricks also applies the following default tag:

| Cluster tag key name | Value |
|---|---|
| SqlEndpointId | Databricks internal identifier of the SQL endpoint |

Tag propagation

Tags are propagated to AWS EC2 instances differently depending on whether or not a cluster was created from a pool.

[Figure: cluster and pool tag propagation]

If a cluster is created from a pool, its EC2 instances inherit only the custom and default pool tags, not the cluster tags. Therefore, if you plan to create clusters from a pool, assign all of the custom tags you need to the pool itself.

If a cluster is not created from a pool, its cluster tags (custom and default) propagate to its EC2 instances.

Cluster and pool tags both propagate to DBU usage reports, whether or not the cluster was created from a pool.

If there is a tag name conflict, Databricks default tags take precedence over custom tags, and pool tags take precedence over cluster tags.
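The propagation and precedence rules above can be sketched as two small functions, one for the tags that reach EC2 instances and one for the tags that reach DBU usage reports. This is an illustrative model only, not Databricks code: the function names are hypothetical, and the relative order between pool custom tags and cluster default tags is not stated in the text, so the layering used in usage_report_tags is an assumption.

```python
def ec2_instance_tags(cluster_default, cluster_custom,
                      pool_default=None, pool_custom=None):
    """Tags expected on EC2 instances (illustrative model)."""
    if pool_default is not None:
        # Cluster created from a pool: only pool tags reach the instances.
        # Default tags override custom tags with the same key.
        return {**(pool_custom or {}), **pool_default}
    # Cluster not created from a pool: cluster tags reach the instances.
    return {**cluster_custom, **cluster_default}


def usage_report_tags(cluster_default, cluster_custom,
                      pool_default=None, pool_custom=None):
    """Tags expected in DBU usage reports (illustrative model)."""
    merged = {}
    # Later layers win. ASSUMPTION: lowest to highest priority is
    # cluster custom, pool custom, cluster default, pool default.
    for layer in (cluster_custom, pool_custom or {},
                  cluster_default, pool_default or {}):
        merged.update(layer)
    return merged


pool_default = {"Vendor": "Databricks", "DatabricksInstancePoolId": "pool-123"}
pool_custom = {"Team": "platform"}
cluster_default = {"Vendor": "Databricks", "ClusterName": "etl"}
cluster_custom = {"Team": "data-eng", "Project": "churn"}

print(ec2_instance_tags(cluster_default, cluster_custom, pool_default, pool_custom))
print(usage_report_tags(cluster_default, cluster_custom, pool_default, pool_custom))
```

In this example the cluster's Project tag appears in the usage report result but not in the EC2 result, which is why custom tags needed for AWS cost analysis belong on the pool.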

Limitations

  • Tag keys and values can contain only characters from the ISO 8859-1 (latin1) set. Tags containing other characters are ignored.

  • If you change tag key names or values, the changes take effect only after a cluster restart or pool expansion.
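Because tags with unsupported characters are ignored rather than rejected, it can help to check them before applying them. The following sketch flags tags whose key or value cannot be encoded as ISO 8859-1. The function name ignored_tags is hypothetical, and Python's latin-1 codec accepts every code point from U+0000 to U+00FF, which may be slightly broader than what Databricks accepts, so treat it as an approximation.

```python
def ignored_tags(tags):
    """Return the tags whose key or value falls outside ISO 8859-1.

    Approximation: relies on Python's "latin-1" codec, which accepts
    all code points U+0000 through U+00FF.
    """
    bad = {}
    for key, value in tags.items():
        try:
            key.encode("latin-1")
            value.encode("latin-1")
        except UnicodeEncodeError:
            # At least one character has no latin1 representation.
            bad[key] = value
    return bad


tags = {"Team": "données", "Budget": "100€", "Owner": "データ"}
print(ignored_tags(tags))
```

Here "données" passes because é is part of latin1, while the euro sign and the Japanese characters are not, so the Budget and Owner tags are reported.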