Override with target settings
This page describes how to override or join top-level settings with target settings in Databricks Asset Bundles. For information about bundle settings, see Databricks Asset Bundle configuration.
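For orientation, the following is a minimal sketch of how the top-level mappings and the targets mapping relate in a bundle's databricks.yml file. The bundle name, target name, and artifact settings here are placeholders for illustration:

bundle:
  name: my-bundle # Illustrative bundle name.

# Top-level mappings define the defaults for every target.
artifacts:
  my-artifact:
    type: whl
    path: ./my_package

# Each entry under targets can restate a mapping to override or
# extend the top-level defaults for that target only.
targets:
  dev:
    artifacts:
      my-artifact:
        path: ./my_other_package # Takes precedence over ./my_package for dev.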
Artifact settings override
You can override the artifact settings in a top-level artifacts mapping with the artifact settings in a targets mapping, for example:
# ...
artifacts:
  <some-unique-programmatic-identifier-for-this-artifact>:
    # Artifact settings.

targets:
  <some-unique-programmatic-identifier-for-this-target>:
    artifacts:
      <the-matching-programmatic-identifier-for-this-artifact>:
        # Any more artifact settings to join with the settings from the
        # matching top-level artifacts mapping.
If any artifact setting is defined both in the top-level artifacts mapping and the targets mapping for the same artifact, then the setting in the targets mapping takes precedence over the setting in the top-level artifacts mapping.
Example 1: Artifact settings defined only in the top-level artifacts mapping
To demonstrate how this works in practice, the following example defines path only in the top-level artifacts mapping, which supplies all of the settings for the artifact:
# ...
artifacts:
  my-artifact:
    type: whl
    path: ./my_package
# ...
When you run databricks bundle validate for this example, the resulting graph is as follows (ellipses indicate omitted content, for brevity):
{
  "...": "...",
  "artifacts": {
    "my-artifact": {
      "type": "whl",
      "path": "./my_package",
      "...": "..."
    }
  },
  "...": "..."
}
Example 2: Conflicting artifact settings defined in multiple artifact mappings
In this example, path is defined both in the top-level artifacts mapping and in the artifacts mapping in targets. The path in the artifacts mapping in targets takes precedence over the path in the top-level artifacts mapping, defining the settings for the artifact:
# ...
artifacts:
  my-artifact:
    type: whl
    path: ./my_package

targets:
  dev:
    artifacts:
      my-artifact:
        path: ./my_other_package
# ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
  "...": "...",
  "artifacts": {
    "my-artifact": {
      "type": "whl",
      "path": "./my_other_package",
      "...": "..."
    }
  },
  "...": "..."
}
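The same mechanism extends to any number of targets, each supplying its own overrides for the matching artifact. A hedged sketch, assuming illustrative dev and prod targets that each point the artifact at a different path:

# ...
artifacts:
  my-artifact:
    type: whl
    path: ./my_package

targets:
  dev:
    artifacts:
      my-artifact:
        path: ./my_dev_package # Used when deploying to the dev target.
  prod:
    artifacts:
      my-artifact:
        path: ./my_prod_package # Used when deploying to the prod target.
# ...

The type: whl setting is not restated in either target, so both targets inherit it from the top-level artifacts mapping.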
Cluster settings override
You can override or join the job or pipeline cluster settings for a target.
For jobs, use job_cluster_key within a job definition to identify job cluster settings in the top-level resources mapping to join with job cluster settings in a targets mapping:
# ...
resources:
  jobs:
    <some-unique-programmatic-identifier-for-this-job>:
      # ...
      job_clusters:
        - job_cluster_key: <some-unique-programmatic-identifier-for-this-key>
          new_cluster:
            # Cluster settings.

targets:
  <some-unique-programmatic-identifier-for-this-target>:
    resources:
      jobs:
        <the-matching-programmatic-identifier-for-this-job>:
          # ...
          job_clusters:
            - job_cluster_key: <the-matching-programmatic-identifier-for-this-key>
              # Any more cluster settings to join with the settings from the
              # resources mapping for the matching top-level job_cluster_key.
# ...
If any cluster setting is defined both in the top-level resources mapping and the targets mapping for the same job_cluster_key, then the setting in the targets mapping takes precedence over the setting in the top-level resources mapping.
For Lakeflow Declarative Pipelines, use label within the cluster settings of a pipeline definition to identify cluster settings in a top-level resources mapping to join with the cluster settings in a targets mapping, for example:
# ...
resources:
  pipelines:
    <some-unique-programmatic-identifier-for-this-pipeline>:
      # ...
      clusters:
        - label: default | maintenance
          # Cluster settings.

targets:
  <some-unique-programmatic-identifier-for-this-target>:
    resources:
      pipelines:
        <the-matching-programmatic-identifier-for-this-pipeline>:
          # ...
          clusters:
            - label: default | maintenance
              # Any more cluster settings to join with the settings from the
              # resources mapping for the matching top-level label.
# ...
If any cluster setting is defined both in the top-level resources mapping and the targets mapping for the same label, then the setting in the targets mapping takes precedence over the setting in the top-level resources mapping.
Example 1: New job cluster settings defined in multiple resource mappings and with no settings conflicts
In this example, spark_version in the top-level resources mapping is combined with node_type_id and num_workers in the resources mapping in targets to define the settings for the job_cluster_key named my-cluster:
# ...
resources:
  jobs:
    my-job:
      name: my-job
      job_clusters:
        - job_cluster_key: my-cluster
          new_cluster:
            spark_version: 13.3.x-scala2.12

targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          job_clusters:
            - job_cluster_key: my-cluster
              new_cluster:
                node_type_id: n2-highmem-4
                num_workers: 1
# ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
  "...": "...",
  "resources": {
    "jobs": {
      "my-job": {
        "job_clusters": [
          {
            "job_cluster_key": "my-cluster",
            "new_cluster": {
              "node_type_id": "n2-highmem-4",
              "num_workers": 1,
              "spark_version": "13.3.x-scala2.12"
            }
          }
        ],
        "...": "..."
      }
    }
  }
}
Example 2: Conflicting new job cluster settings defined in multiple resource mappings
In this example, spark_version and num_workers are defined both in the top-level resources mapping and in the resources mapping in targets. The spark_version and num_workers in the resources mapping in targets take precedence over spark_version and num_workers in the top-level resources mapping, defining the settings for the job_cluster_key named my-cluster:
# ...
resources:
  jobs:
    my-job:
      name: my-job
      job_clusters:
        - job_cluster_key: my-cluster
          new_cluster:
            spark_version: 13.3.x-scala2.12
            node_type_id: n2-highmem-4
            num_workers: 1

targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          job_clusters:
            - job_cluster_key: my-cluster
              new_cluster:
                spark_version: 12.2.x-scala2.12
                num_workers: 2
# ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
  "...": "...",
  "resources": {
    "jobs": {
      "my-job": {
        "job_clusters": [
          {
            "job_cluster_key": "my-cluster",
            "new_cluster": {
              "node_type_id": "n2-highmem-4",
              "num_workers": 2,
              "spark_version": "12.2.x-scala2.12"
            }
          }
        ],
        "...": "..."
      }
    }
  }
}
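Matching is per job_cluster_key, so a target can override several job clusters in the same job independently. A hedged sketch, assuming two illustrative keys, small-cluster and large-cluster:

# ...
resources:
  jobs:
    my-job:
      name: my-job
      job_clusters:
        - job_cluster_key: small-cluster
          new_cluster:
            spark_version: 13.3.x-scala2.12
            num_workers: 1
        - job_cluster_key: large-cluster
          new_cluster:
            spark_version: 13.3.x-scala2.12
            num_workers: 8

targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          job_clusters:
            - job_cluster_key: small-cluster # Joins with the matching top-level key.
              new_cluster:
                num_workers: 2
            - job_cluster_key: large-cluster # Matched independently of small-cluster.
              new_cluster:
                num_workers: 4
# ...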
Example 3: Pipeline cluster settings defined in multiple resource mappings and with no settings conflicts
In this example, node_type_id in the top-level resources mapping is combined with num_workers in the resources mapping in targets to define the settings for the label named default:
# ...
resources:
  pipelines:
    my-pipeline:
      clusters:
        - label: default
          node_type_id: n2-highmem-4

targets:
  development:
    resources:
      pipelines:
        my-pipeline:
          clusters:
            - label: default
              num_workers: 1
# ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
  "...": "...",
  "resources": {
    "pipelines": {
      "my-pipeline": {
        "clusters": [
          {
            "label": "default",
            "node_type_id": "n2-highmem-4",
            "num_workers": 1
          }
        ],
        "...": "..."
      }
    }
  }
}
Example 4: Conflicting pipeline cluster settings defined in multiple resource mappings
In this example, num_workers is defined both in the top-level resources mapping and in the resources mapping in targets. The num_workers in the resources mapping in targets takes precedence over num_workers in the top-level resources mapping, defining the settings for the label named default:
# ...
resources:
  pipelines:
    my-pipeline:
      clusters:
        - label: default
          node_type_id: n2-highmem-4
          num_workers: 1

targets:
  development:
    resources:
      pipelines:
        my-pipeline:
          clusters:
            - label: default
              num_workers: 2
# ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
  "...": "...",
  "resources": {
    "pipelines": {
      "my-pipeline": {
        "clusters": [
          {
            "label": "default",
            "node_type_id": "n2-highmem-4",
            "num_workers": 2
          }
        ],
        "...": "..."
      }
    }
  }
}
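The same label matching applies to a pipeline's maintenance cluster. A hedged sketch, assuming you want to size only the maintenance cluster differently in the illustrative development target:

# ...
resources:
  pipelines:
    my-pipeline:
      clusters:
        - label: default
          node_type_id: n2-highmem-4
        - label: maintenance
          node_type_id: n2-highmem-4

targets:
  development:
    resources:
      pipelines:
        my-pipeline:
          clusters:
            - label: maintenance # Joins only with the top-level maintenance cluster.
              num_workers: 1
# ...

The default cluster is not restated in the target, so its top-level settings pass through unchanged.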
Job task settings override
You can use the tasks mapping within a job definition to join the job task settings in a top-level resources mapping with the job task settings in a targets mapping, for example:
# ...
resources:
  jobs:
    <some-unique-programmatic-identifier-for-this-job>:
      # ...
      tasks:
        - task_key: <some-unique-programmatic-identifier-for-this-task>
          # Task settings.

targets:
  <some-unique-programmatic-identifier-for-this-target>:
    resources:
      jobs:
        <the-matching-programmatic-identifier-for-this-job>:
          # ...
          tasks:
            - task_key: <the-matching-programmatic-identifier-for-this-task>
              # Any more task settings to join with the settings from the
              # resources mapping for the matching top-level task_key.
# ...
To join the top-level resources mapping and the targets mapping for the same task, the task mappings' task_key must be set to the same value.
If any job task setting is defined both in the top-level resources mapping and the targets mapping for the same task, then the setting in the targets mapping takes precedence over the setting in the top-level resources mapping.
Example 1: Job task settings defined in multiple resource mappings and with no settings conflicts
In this example, spark_version in the top-level resources mapping is combined with node_type_id and num_workers in the resources mapping in targets to define the settings for the task_key named my-task:
# ...
resources:
  jobs:
    my-job:
      name: my-job
      tasks:
        - task_key: my-task
          new_cluster:
            spark_version: 13.3.x-scala2.12

targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          tasks:
            - task_key: my-task
              new_cluster:
                node_type_id: n2-highmem-4
                num_workers: 1
# ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
  "...": "...",
  "resources": {
    "jobs": {
      "my-job": {
        "tasks": [
          {
            "new_cluster": {
              "node_type_id": "n2-highmem-4",
              "num_workers": 1,
              "spark_version": "13.3.x-scala2.12"
            },
            "task_key": "my-task"
          }
        ],
        "...": "..."
      }
    }
  }
}
Example 2: Conflicting job task settings defined in multiple resource mappings
In this example, spark_version and num_workers are defined both in the top-level resources mapping and in the resources mapping in targets. The spark_version and num_workers in the resources mapping in targets take precedence over spark_version and num_workers in the top-level resources mapping. This defines the settings for the task_key named my-task:
# ...
resources:
  jobs:
    my-job:
      name: my-job
      tasks:
        - task_key: my-task
          new_cluster:
            spark_version: 13.3.x-scala2.12
            node_type_id: n2-highmem-4
            num_workers: 1

targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          tasks:
            - task_key: my-task
              new_cluster:
                spark_version: 12.2.x-scala2.12
                num_workers: 2
# ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
  "...": "...",
  "resources": {
    "jobs": {
      "my-job": {
        "tasks": [
          {
            "new_cluster": {
              "node_type_id": "n2-highmem-4",
              "num_workers": 2,
              "spark_version": "12.2.x-scala2.12"
            },
            "task_key": "my-task"
          }
        ],
        "...": "..."
      }
    }
  }
}
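Putting the pieces together, the following is a hedged sketch of a single development target that overrides an artifact path, a job cluster, and a job task at once, following the same matching rules shown above. All names are illustrative, and max_retries stands in for any task setting you might override:

# ...
artifacts:
  my-artifact:
    type: whl
    path: ./my_package

resources:
  jobs:
    my-job:
      name: my-job
      job_clusters:
        - job_cluster_key: my-cluster
          new_cluster:
            spark_version: 13.3.x-scala2.12
      tasks:
        - task_key: my-task
          job_cluster_key: my-cluster

targets:
  development:
    artifacts:
      my-artifact:
        path: ./my_other_package # Takes precedence over the top-level path.
    resources:
      jobs:
        my-job:
          name: my-job
          job_clusters:
            - job_cluster_key: my-cluster # Matches the top-level key, so settings join.
              new_cluster:
                num_workers: 1
          tasks:
            - task_key: my-task # Matches the top-level key, so settings join.
              max_retries: 2
# ...

When you run databricks bundle validate for this configuration, each override should appear merged into its matching top-level definition, as in the earlier examples.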