Monitor usage using cluster and pool tags
To monitor cost and accurately attribute Databricks usage to your organization's business units and teams (for chargebacks, for example), you can tag clusters and pools. These tags propagate both to detailed DBU usage reports and to AWS EC2 instances and AWS EBS volumes for cost analysis.
Tagged objects and resources
You can add custom tags for the following objects managed by Databricks:
Object | Tagging interface (UI) | Tagging interface (API)
---|---|---
Pool | Pool UI in the Databricks workspace | Instance Pools API
Cluster | Cluster UI in the Databricks workspace | Clusters API
Warning

Do not assign a custom tag with the key `Name` to a cluster. Every cluster has a tag `Name` whose value is set by Databricks. If you change the value associated with the key `Name`, the cluster can no longer be tracked by Databricks. As a consequence, the cluster might not be terminated after becoming idle and will continue to incur usage costs.
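As an illustration, custom tags can be attached at cluster creation time through the Clusters API (`POST /api/2.0/clusters/create`). The payload below is a minimal sketch; the cluster name, node type, Spark version, and tag keys and values are placeholders, not prescribed names:

```json
{
  "cluster_name": "analytics-adhoc",
  "spark_version": "13.3.x-scala2.12",
  "node_type_id": "i3.xlarge",
  "num_workers": 2,
  "custom_tags": {
    "Team": "data-platform",
    "CostCenter": "4210"
  }
}
```

Each key in `custom_tags` becomes a tag on the cluster's EC2 instances (unless the cluster is created from a pool; see Tag propagation below).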
Databricks adds the following default tags to all pools and clusters:
Pool tag key name | Value
---|---
Vendor | Constant value: Databricks
DatabricksInstancePoolCreatorId | Databricks internal ID of the user who created the pool
DatabricksInstancePoolId | Databricks internal ID of the pool
Cluster tag key name | Value
---|---
Vendor | Constant value: Databricks
ClusterId | Databricks internal ID of the cluster
ClusterName | Name of the cluster
Creator | Username (email address) of the user who created the cluster
On job clusters, Databricks also applies the following default tags:
Cluster tag key name | Value
---|---
RunName | Job name
JobId | Job ID
On resources used by Databricks SQL, Databricks also applies the following default tag:
Cluster tag key name | Value
---|---
SqlWarehouseId | Databricks internal identifier of the SQL warehouse
Tag propagation
Tags are propagated to AWS EC2 instances differently depending on whether or not a cluster was created from a pool.

If a cluster is created from a pool, its EC2 instances inherit only the custom and default pool tags, not the cluster tags. Therefore, if you plan to create clusters from a pool, assign all of the custom cluster tags you need to the pool itself.
If a cluster is not created from a pool, its tags propagate as expected to EC2 instances.
Cluster and pool tags both propagate to DBU usage reports, whether or not the cluster was created from a pool.
If there is a tag name conflict, Databricks default tags take precedence over custom tags and pool tags take precedence over cluster tags.
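Because pool-backed clusters inherit EC2 tags from the pool, tags that must reach the instances have to be set when the pool is created. A sketch of an Instance Pools API payload (`POST /api/2.0/instance-pools/create`); the pool name, node type, and tag values are illustrative assumptions:

```json
{
  "instance_pool_name": "shared-analytics-pool",
  "node_type_id": "i3.xlarge",
  "custom_tags": {
    "Team": "data-platform",
    "CostCenter": "4210"
  }
}
```

Any cluster attached to this pool would carry these tags on its EC2 instances, while the cluster's own custom tags would still appear in DBU usage reports.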
Limitations
Tag keys and values can contain only characters from the ISO 8859-1 (latin1) set. Tags containing other characters are ignored.
If you change tag key names or values, these changes apply only after cluster restart or pool expansion.
If the cluster’s custom tags conflict with a pool’s custom tags, the cluster can’t be created.
Enforce mandatory tags
To ensure that certain tags are always populated when clusters are created, you can apply a specific IAM policy to your account’s primary IAM role (the one created during account setup; contact your AWS administrator if you need access). The IAM policy should include explicit Deny statements for mandatory tag keys and optional values. Cluster creation will fail if required tags with one of the allowed values aren’t provided.
For example, if you want to enforce `Department` and `Project` tags, with only specified values allowed for the former and a free-form non-empty value for the latter, you could apply an IAM policy like this one:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "MandateLaunchWithTag1",
      "Effect": "Deny",
      "Action": [
        "ec2:RunInstances",
        "ec2:CreateTags"
      ],
      "Resource": "arn:aws:ec2:region:accountId:instance/*",
      "Condition": {
        "StringNotEqualsIgnoreCase": {
          "aws:RequestTag/Department": [
            "Deptt1", "Deptt2", "Deptt3"
          ]
        }
      }
    },
    {
      "Sid": "MandateLaunchWithTag2",
      "Effect": "Deny",
      "Action": [
        "ec2:RunInstances",
        "ec2:CreateTags"
      ],
      "Resource": "arn:aws:ec2:region:accountId:instance/*",
      "Condition": {
        "StringNotLike": {
          "aws:RequestTag/Project": "?*"
        }
      }
    }
  ]
}
```
Both `ec2:RunInstances` and `ec2:CreateTags` actions are required for each tag for effective coverage of scenarios in which there are clusters that have only on-demand instances, only spot instances, or both.
Tip
Databricks recommends that you add a separate policy statement for each tag. The overall policy might become long, but it is easier to debug. See the IAM Policy Condition Operators Reference for a list of operators that can be used in a policy.
Cluster creation errors due to an IAM policy show an encoded error message, starting with:

```
Cloud Provider Launch Failure: A cloud provider error was encountered while setting up the cluster.
```
The message is encoded because the details of the authorization status can constitute privileged information that the user who requested the action should not see. See DecodeAuthorizationMessage API (or CLI) for information about how to decode such messages.
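For example, assuming the AWS CLI is configured with credentials that are allowed to call `sts:DecodeAuthorizationMessage`, the encoded blob can be decoded like this (`<encoded-message>` is a placeholder for the string taken from the error message):

```shell
# Decode the authorization failure details; the output is a JSON document
# describing the denied action, the resource, and the policy context.
aws sts decode-authorization-message \
  --encoded-message "<encoded-message>" \
  --query DecodedMessage \
  --output text | python -m json.tool
```

The decoded JSON shows which Deny statement matched, which makes it straightforward to see which mandatory tag was missing or had a disallowed value.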