Databricks Asset Bundles resources

Databricks Asset Bundles allows you to specify information about the Databricks resources used by the bundle in the resources mapping in the bundle configuration. See resources mapping and resources key reference.

This article outlines supported resource types for bundles and provides details and an example for each supported type. For additional examples, see Bundle configuration examples.

tip

To generate YAML for any existing resource, use the databricks bundle generate command. See Generate a bundle configuration file.
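For example, to add the configuration for an existing job to your bundle project, you could run a command like the following, replacing the placeholder with your own job ID:

databricks bundle generate job --existing-job-id <job-id>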

Supported resources

The following table lists supported resource types for bundles. Some resources can be created by defining them in a bundle and deploying the bundle, and some resources can only be created by referencing an existing asset to include in the bundle.

Resources are defined using the corresponding Databricks REST API object’s create operation request payload, where the object’s supported fields, expressed as YAML, are the resource’s supported properties. Links to documentation for each resource’s corresponding payloads are listed in the table.
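For example, the following minimal configuration sketch (resource and file names are illustrative) declares one job under the top-level resources mapping:

YAML
resources:
  jobs:
    my_job:
      name: my_job
      tasks:
        - task_key: my_task
          notebook_task:
            notebook_path: ./my_notebook.py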

tip

The databricks bundle validate command returns warnings if unknown resource properties are found in bundle configuration files.

| Resource | Corresponding REST API object |
| --- | --- |
| app | App object |
| cluster | Cluster object |
| dashboard | Dashboard object |
| experiment | Experiment object |
| job | Job object |
| model (legacy) | Model (legacy) object |
| model_serving_endpoint | Model serving endpoint object |
| pipeline | Pipeline object |
| quality_monitor | Quality monitor object |
| registered_model (Unity Catalog) | Registered model object |
| schema (Unity Catalog) | Schema object |
| volume (Unity Catalog) | Volume object |

app

Type: Map

The app resource defines a Databricks app. For information about Databricks Apps, see What is Databricks Apps?.

To add an app, specify the settings to define the app, including the required source_code_path.

tip

You can initialize a bundle with a Streamlit Databricks app using the following command:

databricks bundle init https://github.com/databricks/bundle-examples --template-dir contrib/templates/streamlit-app
YAML
apps:
  <app-name>:
    <app-field-name>: <app-field-value>

| Key | Type | Description |
| --- | --- | --- |
| budget_policy_id | String | The budget policy ID for the app. |
| config | Map | Deprecated. Define your app configuration commands and environment variables in the app.yaml file instead. See Databricks Apps configuration. |
| description | String | The description of the app. |
| name | String | The name of the app. The name must contain only lowercase alphanumeric characters and hyphens. It must be unique within the workspace. |
| permissions | Sequence | The app's permissions. See permissions. |
| resources | Sequence | The app compute resources. See apps.name.resources. |
| source_code_path | String | The local path of the Databricks app source code, for example ./app. This field is required. |
| user_api_scopes | Sequence | The user API scopes. |

apps.name.resources

Type: Sequence

The compute resources for the app.

| Key | Type | Description |
| --- | --- | --- |
| description | String | The description of the app resource. |
| job | Map | The settings that identify the job resource to use. See resources.job. |
| name | String | The name of the app resource. |
| secret | Map | The secret settings. See resources.secret. |
| serving_endpoint | Map | The settings that identify the serving endpoint resource to use. See resources.serving_endpoint. |
| sql_warehouse | Map | The settings that identify the warehouse resource to use. See resources.sql_warehouse. |

Example

The following example creates an app named my_app that manages a job created by the bundle:

YAML
resources:
  jobs:
    # Define a job in the bundle
    hello_world:
      name: hello_world
      tasks:
        - task_key: task
          spark_python_task:
            python_file: ../src/main.py
          environment_key: default

      environments:
        - environment_key: default
          spec:
            client: '1'

  # Define an app that manages the job in the bundle
  apps:
    job_manager:
      name: 'job_manager_app'
      description: 'An app which manages a job created by this bundle'

      # The location of the source code for the app
      source_code_path: ../src/app

      # The resources in the bundle which this app has access to. This binds the resource in the app with the bundle resource.
      resources:
        - name: 'app-job'
          job:
            id: ${resources.jobs.hello_world.id}
            permission: 'CAN_MANAGE_RUN'

The corresponding app.yaml defines the configuration for running the app:

YAML
command:
  - flask
  - --app
  - app
  - run
  - --debug
env:
  - name: JOB_ID
    valueFrom: 'app-job'

For the complete Databricks app example bundle, see the bundle-examples GitHub repository.

cluster

Type: Map

The cluster resource defines a cluster.

YAML
clusters:
  <cluster-name>:
    <cluster-field-name>: <cluster-field-value>

| Key | Type | Description |
| --- | --- | --- |
| apply_policy_default_values | Boolean | When set to true, fixed and default values from the policy are used for fields that are omitted. When set to false, only fixed values from the policy are applied. |
| autoscale | Map | Parameters needed in order to automatically scale clusters up and down based on load. See autoscale. |
| autotermination_minutes | Integer | Automatically terminates the cluster after it is inactive for this time in minutes. If not set, this cluster is not automatically terminated. If specified, the threshold must be between 10 and 10000 minutes. Users can also set this value to 0 to explicitly disable automatic termination. |
| aws_attributes | Map | Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values is used. See aws_attributes. |
| azure_attributes | Map | Attributes related to clusters running on Microsoft Azure. If not specified at cluster creation, a set of default values is used. See azure_attributes. |
| cluster_log_conf | Map | The configuration for delivering Spark logs to a long-term storage destination. See cluster_log_conf. |
| cluster_name | String | Cluster name requested by the user. This doesn't have to be unique. If not specified at creation, the cluster name is an empty string. |
| custom_tags | Map | Additional tags for cluster resources. Databricks tags all cluster resources (such as AWS instances and EBS volumes) with these tags in addition to default_tags. See custom_tags. |
| data_security_mode | String | The data governance model to use when accessing data from a cluster. See data_security_mode. |
| docker_image | Map | The custom Docker image. See docker_image. |
| driver_instance_pool_id | String | The optional ID of the instance pool to which the cluster's driver belongs. If a driver pool is not assigned, the driver uses the instance pool specified by instance_pool_id. |
| driver_node_type_id | String | The node type of the Spark driver. This field is optional; if unset, the driver node type is set to the same value as node_type_id. This field, along with node_type_id, should not be set if virtual_cluster_size is set. If driver_node_type_id, node_type_id, and virtual_cluster_size are all specified, driver_node_type_id and node_type_id take precedence. |
| enable_elastic_disk | Boolean | Autoscaling local storage: when enabled, this cluster dynamically acquires additional disk space when its Spark workers are running low on disk space. This feature requires specific AWS permissions to function correctly; refer to the User Guide for more details. |
| enable_local_disk_encryption | Boolean | Whether to enable LUKS on cluster VMs' local disks. |
| gcp_attributes | Map | Attributes related to clusters running on Google Cloud Platform. If not specified at cluster creation, a set of default values is used. See gcp_attributes. |
| init_scripts | Sequence | The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. See init_scripts. |
| instance_pool_id | String | The optional ID of the instance pool to which the cluster belongs. |
| is_single_node | Boolean | This field can only be used when kind = CLASSIC_PREVIEW. When set to true, Databricks automatically sets the single-node related custom_tags, spark_conf, and num_workers. |
| kind | String | The kind of compute described by this compute specification. |
| node_type_id | String | This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory- or compute-intensive workloads. A list of available node types can be retrieved by using the clusters/listNodeTypes API call. |
| num_workers | Integer | Number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors for a total of num_workers + 1 Spark nodes. |
| permissions | Sequence | The cluster permissions. See permissions. |
| policy_id | String | The ID of the cluster policy used to create the cluster, if applicable. |
| runtime_engine | String | Determines the cluster's runtime engine, either STANDARD or PHOTON. |
| single_user_name | String | Single user name if data_security_mode is SINGLE_USER. |
| spark_conf | Map | An object containing a set of optional, user-specified Spark configuration key-value pairs. Users can also pass in a string of extra JVM options to the driver and the executors via spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively. See spark_conf. |
| spark_env_vars | Map | An object containing a set of optional, user-specified environment variable key-value pairs. |
| spark_version | String | The Spark version of the cluster, for example 3.3.x-scala2.11. A list of available Spark versions can be retrieved by using the clusters/sparkVersions API call. |
| ssh_public_keys | Sequence | SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to log in with the user name ubuntu on port 2200. Up to 10 keys can be specified. |
| use_ml_runtime | Boolean | This field can only be used when kind = CLASSIC_PREVIEW. effective_spark_version is determined by spark_version (the DBR release), this field (use_ml_runtime), and whether node_type_id is a GPU node. |
| workload_type | Map | Cluster attributes that describe the cluster's workload types. See workload_type. |

Examples

The following example creates a dedicated (single-user) cluster for the current user with Databricks Runtime 15.4 LTS and a cluster policy:

YAML
resources:
  clusters:
    my_cluster:
      num_workers: 0
      node_type_id: 'i3.xlarge'
      driver_node_type_id: 'i3.xlarge'
      spark_version: '15.4.x-scala2.12'
      spark_conf:
        'spark.executor.memory': '2g'
      autotermination_minutes: 60
      enable_elastic_disk: true
      single_user_name: ${workspace.current_user.userName}
      policy_id: '000128DB309672CA'
      enable_local_disk_encryption: false
      data_security_mode: SINGLE_USER
      runtime_engine: STANDARD

This example creates a simple cluster my_cluster and sets that as the cluster to use to run the notebook in my_job:

YAML
bundle:
  name: clusters

resources:
  clusters:
    my_cluster:
      num_workers: 2
      node_type_id: 'i3.xlarge'
      autoscale:
        min_workers: 2
        max_workers: 7
      spark_version: '13.3.x-scala2.12'
      spark_conf:
        'spark.executor.memory': '2g'

  jobs:
    my_job:
      tasks:
        - task_key: test_task
          notebook_task:
            notebook_path: './src/my_notebook.py'
          existing_cluster_id: ${resources.clusters.my_cluster.id}

dashboard

Type: Map

The dashboard resource allows you to manage AI/BI dashboards in a bundle. For information about AI/BI dashboards, see Dashboards.

YAML
dashboards:
  <dashboard-name>:
    <dashboard-field-name>: <dashboard-field-value>

| Key | Type | Description |
| --- | --- | --- |
| display_name | String | The display name of the dashboard. |
| etag | String | The etag for the dashboard. Can be optionally provided on updates to ensure that the dashboard has not been modified since the last read. |
| file_path | String | The local path of the dashboard asset, including the file name. Exported dashboards always have the file extension .lvdash.json. |
| permissions | Sequence | The dashboard permissions. See permissions. |
| serialized_dashboard | Any | The contents of the dashboard in serialized string form. |
| warehouse_id | String | The warehouse ID used to run the dashboard. |

Example

The following example includes and deploys the sample NYC Taxi Trip Analysis dashboard to the Databricks workspace.

YAML
resources:
  dashboards:
    nyc_taxi_trip_analysis:
      display_name: 'NYC Taxi Trip Analysis'
      file_path: ../src/nyc_taxi_trip_analysis.lvdash.json
      warehouse_id: ${var.warehouse_id}

If you use the UI to modify the dashboard, modifications made through the UI are not applied to the dashboard JSON file in the local bundle unless you explicitly update it using bundle generate. You can use the --watch option to continuously poll and retrieve changes to the dashboard. See Generate a bundle configuration file.

In addition, if you attempt to deploy a bundle that contains a dashboard JSON file that is different than the one in the remote workspace, an error will occur. To force the deploy and overwrite the dashboard in the remote workspace with the local one, use the --force option. See Deploy a bundle.
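For example, assuming the nyc_taxi_trip_analysis resource above and a recent version of the Databricks CLI, the following commands poll the deployed dashboard for changes and then force a deployment that overwrites the remote dashboard with the local copy:

databricks bundle generate dashboard --resource nyc_taxi_trip_analysis --watch
databricks bundle deploy --force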

experiment

Type: Map

The experiment resource allows you to define MLflow experiments in a bundle. For information about MLflow experiments, see Organize training runs with MLflow experiments.

YAML
experiments:
  <experiment-name>:
    <experiment-field-name>: <experiment-field-value>

| Key | Type | Description |
| --- | --- | --- |
| artifact_location | String | The location where artifacts for the experiment are stored. |
| name | String | The friendly name that identifies the experiment. |
| permissions | Sequence | The experiment's permissions. See permissions. |
| tags | Sequence | Additional metadata key-value pairs. See tags. |

Example

The following example defines an experiment that all users can view:

YAML
resources:
  experiments:
    experiment:
      name: my_ml_experiment
      permissions:
        - level: CAN_READ
          group_name: users
      description: MLflow experiment used to track runs
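Experiment tags are defined as a sequence of key-value pairs. The following sketch adds hypothetical tags to the same experiment:

YAML
resources:
  experiments:
    experiment:
      name: my_ml_experiment
      tags:
        - key: project
          value: forecasting
        - key: team
          value: data-science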

job

Type: Map

The job resource allows you to define jobs and their corresponding tasks in your bundle. For information about jobs, see Orchestration using Databricks Jobs. For a tutorial that uses a Databricks Asset Bundles template to create a job, see Develop a job on Databricks using Databricks Asset Bundles.

YAML
jobs:
  <job-name>:
    <job-field-name>: <job-field-value>

| Key | Type | Description |
| --- | --- | --- |
| budget_policy_id | String | The ID of the user-specified budget policy to use for this job. If not specified, a default budget policy may be applied when creating or modifying the job. See effective_budget_policy_id for the budget policy used by this workload. |
| continuous | Map | An optional continuous property for this job. The continuous property ensures that there is always one run executing. Only one of schedule and continuous can be used. See continuous. |
| deployment | Map | Deployment information for jobs managed by external sources. See deployment. |
| description | String | An optional description for the job. The maximum length is 27700 characters in UTF-8 encoding. |
| edit_mode | String | Edit mode of the job, either UI_LOCKED or EDITABLE. |
| email_notifications | Map | An optional set of email addresses that are notified when runs of this job begin or complete, as well as when this job is deleted. See email_notifications. |
| environments | Sequence | A list of task execution environment specifications that can be referenced by serverless tasks of this job. An environment is required to be present for serverless tasks. For serverless notebook tasks, the environment is accessible in the notebook environment panel. For other serverless tasks, the task environment must be specified using environment_key in the task settings. See environments. |
| format | String | The format of the job. |
| git_source | Map | An optional specification for a remote Git repository containing the source code used by tasks. The git_source field and task source field set to GIT are not recommended for bundles, because local relative paths may not point to the same content in the Git repository, and bundles expect that a deployed job has the same content as the local copy from where it was deployed. Instead, clone the repository locally and set up your bundle project within this repository, so that the source for tasks is the workspace. |
| health | Map | An optional set of health rules that can be defined for this job. See health. |
| job_clusters | Sequence | A list of job cluster specifications that can be shared and reused by tasks of this job. See clusters. |
| max_concurrent_runs | Integer | An optional maximum allowed number of concurrent runs of the job. Set this value if you want to be able to execute multiple runs of the same job concurrently. See max_concurrent_runs. |
| name | String | An optional name for the job. The maximum length is 4096 bytes in UTF-8 encoding. |
| notification_settings | Map | Optional notification settings that are used when sending notifications to each of the email_notifications and webhook_notifications for this job. See notification_settings. |
| parameters | Sequence | Job-level parameter definitions. See parameters. |
| performance_target | String | PerformanceTarget defines how performant or cost-efficient the execution of a run on serverless compute should be. |
| permissions | Sequence | The job's permissions. See permissions. |
| queue | Map | The queue settings of the job. See queue. |
| run_as | Map | Write-only setting. Specifies the user or service principal that the job runs as. If not specified, the job runs as the user who created the job. Either user_name or service_principal_name should be specified. If not, an error is thrown. See Specify a run identity for a Databricks Asset Bundles workflow. |
| schedule | Map | An optional periodic schedule for this job. The default behavior is that the job only runs when triggered by clicking “Run Now” in the Jobs UI or sending an API request to runNow. See schedule. |
| tags | Map | A map of tags associated with the job. These are forwarded to the cluster as cluster tags for jobs clusters, and are subject to the same limitations as cluster tags. A maximum of 25 tags can be added to the job. |
| tasks | Sequence | A list of task specifications to be executed by this job. See Add tasks to jobs in Databricks Asset Bundles. |
| timeout_seconds | Integer | An optional timeout applied to each run of this job. A value of 0 means no timeout. |
| trigger | Map | A configuration to trigger a run when certain conditions are met. See trigger. |
| webhook_notifications | Map | A collection of system notification IDs to notify when runs of this job begin or complete. See webhook_notifications. |

Example

The following example defines a job with the resource key hello-job with one notebook task:

YAML
resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          notebook_task:
            notebook_path: ./hello.py

For information about defining job tasks and overriding job settings, see Add tasks to jobs in Databricks Asset Bundles, Override job tasks settings in Databricks Asset Bundles, and Override cluster settings in Databricks Asset Bundles.

important

The job git_source field and task source field set to GIT are not recommended for bundles, because local relative paths may not point to the same content in the Git repository, and bundles expect that a deployed job has the same content as the local copy from where it was deployed.

Instead, clone the repository locally and set up your bundle project within this repository, so that the source for tasks is the workspace.

model (legacy)

Type: Map

The model resource allows you to define legacy models in bundles. Databricks recommends you use Unity Catalog registered models instead.
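A minimal sketch of a legacy model definition, assuming a hypothetical model name (legacy models are declared under the models mapping, using the fields of the legacy Model object):

YAML
resources:
  models:
    my_legacy_model:
      name: my_legacy_model
      description: A model registered in the legacy Workspace Model Registry.
      permissions:
        - level: CAN_READ
          group_name: users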

model_serving_endpoint

Type: Map

The model_serving_endpoint resource allows you to define model serving endpoints. See Manage model serving endpoints.

YAML
model_serving_endpoints:
  <model_serving_endpoint-name>:
    <model_serving_endpoint-field-name>: <model_serving_endpoint-field-value>

| Key | Type | Description |
| --- | --- | --- |
| ai_gateway | Map | The AI Gateway configuration for the serving endpoint. NOTE: Only external model and provisioned throughput endpoints are currently supported. See ai_gateway. |
| config | Map | The core config of the serving endpoint. See config. |
| name | String | The name of the serving endpoint. This field is required and must be unique across a Databricks workspace. An endpoint name can consist of alphanumeric characters, dashes, and underscores. |
| permissions | Sequence | The model serving endpoint's permissions. See permissions. |
| rate_limits | Sequence | Rate limits to be applied to the serving endpoint. NOTE: this field is deprecated; use AI Gateway to manage rate limits. See rate_limits. |
| route_optimized | Boolean | Enable route optimization for the serving endpoint. |
| tags | Sequence | Tags to be attached to the serving endpoint and automatically propagated to billing logs. See tags. |

Example

The following example defines a Unity Catalog model serving endpoint:

YAML
resources:
  model_serving_endpoints:
    uc_model_serving_endpoint:
      name: 'uc-model-endpoint'
      config:
        served_entities:
          - entity_name: 'myCatalog.mySchema.my-ads-model'
            entity_version: '10'
            workload_size: 'Small'
            scale_to_zero_enabled: 'true'
        traffic_config:
          routes:
            - served_model_name: 'my-ads-model-10'
              traffic_percentage: '100'
      tags:
        - key: 'team'
          value: 'data science'

pipeline

Type: Map

The pipeline resource allows you to create DLT pipelines. For information about pipelines, see What is DLT?. For a tutorial that uses the Databricks Asset Bundles template to create a pipeline, see Develop DLT pipelines with Databricks Asset Bundles.

YAML
pipelines:
  <pipeline-name>:
    <pipeline-field-name>: <pipeline-field-value>

| Key | Type | Description |
| --- | --- | --- |
| allow_duplicate_names | Boolean | If false, deployment will fail if the name conflicts with that of another pipeline. |
| catalog | String | A catalog in Unity Catalog to publish data from this pipeline to. If target is specified, tables in this pipeline are published to a target schema inside catalog (for example, catalog.target.table). If target is not specified, no data is published to Unity Catalog. |
| channel | String | The DLT Release Channel that specifies which version of DLT to use. |
| clusters | Sequence | The cluster settings for this pipeline deployment. See cluster. |
| configuration | Map | The configuration for this pipeline execution. |
| continuous | Boolean | Whether the pipeline is continuous or triggered. This replaces trigger. |
| deployment | Map | Deployment type of this pipeline. See deployment. |
| development | Boolean | Whether the pipeline is in development mode. Defaults to false. |
| dry_run | Boolean | Whether the pipeline is a dry run pipeline. |
| edition | String | The pipeline product edition. |
| event_log | Map | The event log configuration for this pipeline. See event_log. |
| filters | Map | The filters that determine which pipeline packages to include in the deployed graph. See filters. |
| id | String | Unique identifier for this pipeline. |
| ingestion_definition | Map | The configuration for a managed ingestion pipeline. These settings cannot be used with the libraries, schema, target, or catalog settings. See ingestion_definition. |
| libraries | Sequence | Libraries or code needed by this deployment. See libraries. |
| name | String | A friendly name for this pipeline. |
| notifications | Sequence | The notification settings for this pipeline. See notifications. |
| permissions | Sequence | The pipeline's permissions. See permissions. |
| photon | Boolean | Whether Photon is enabled for this pipeline. |
| schema | String | The default schema (database) where tables are read from or published to. |
| serverless | Boolean | Whether serverless compute is enabled for this pipeline. |
| storage | String | The DBFS root directory for storing checkpoints and tables. |
| target | String | Target schema (database) to add tables in this pipeline to. Exactly one of schema or target must be specified. To publish to Unity Catalog, also specify catalog. This legacy field is deprecated for pipeline creation in favor of the schema field. |
| trigger | Map | Deprecated. Which pipeline trigger to use. Use continuous instead. |

Example

The following example defines a pipeline with the resource key hello-pipeline:

YAML
resources:
  pipelines:
    hello-pipeline:
      name: hello-pipeline
      clusters:
        - label: default
          num_workers: 1
      development: true
      continuous: false
      channel: CURRENT
      edition: CORE
      photon: false
      libraries:
        - notebook:
            path: ./pipeline.py
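The example above uses classic pipeline clusters. The following sketch instead defines a serverless pipeline that publishes to Unity Catalog; the catalog and schema names are hypothetical:

YAML
resources:
  pipelines:
    hello-serverless-pipeline:
      name: hello-serverless-pipeline
      serverless: true
      catalog: main
      schema: my_schema
      continuous: false
      libraries:
        - notebook:
            path: ./pipeline.py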

quality_monitor (Unity Catalog)

Type: Map

The quality_monitor resource allows you to define a Unity Catalog table monitor. For information about monitors, see Introduction to Databricks Lakehouse Monitoring.

YAML
quality_monitors:
  <quality_monitor-name>:
    <quality_monitor-field-name>: <quality_monitor-field-value>

| Key | Type | Description |
| --- | --- | --- |
| assets_dir | String | The directory to store monitoring assets (e.g. dashboard, metric tables). |
| baseline_table_name | String | Name of the baseline table from which drift metrics are computed. Columns in the monitored table should also be present in the baseline table. |
| custom_metrics | Sequence | Custom metrics to compute on the monitored table. These can be aggregate metrics, derived metrics (from already computed aggregate metrics), or drift metrics (comparing metrics across time windows). See custom_metrics. |
| inference_log | Map | Configuration for monitoring inference logs. See inference_log. |
| notifications | Map | The notification settings for the monitor. See notifications. |
| output_schema_name | String | Schema where output metric tables are created. |
| schedule | Map | The schedule for automatically updating and refreshing metric tables. See schedule. |
| skip_builtin_dashboard | Boolean | Whether to skip creating a default dashboard summarizing data quality metrics. |
| slicing_exprs | Sequence | List of column expressions to slice data with for targeted analysis. The data is grouped by each expression independently, resulting in a separate slice for each predicate and its complements. For high-cardinality columns, only the top 100 unique values by frequency will generate slices. |
| snapshot | Map | Configuration for monitoring snapshot tables. |
| table_name | String | The full name of the table. |
| time_series | Map | Configuration for monitoring time series tables. See time_series. |
| warehouse_id | String | Optional argument to specify the warehouse for dashboard creation. If not specified, the first running warehouse is used. |

Examples

For a complete example bundle that defines a quality_monitor, see the mlops_demo bundle.

The following examples define quality monitors for InferenceLog, TimeSeries, and Snapshot profile types.

YAML
# InferenceLog profile type
resources:
  quality_monitors:
    my_quality_monitor:
      table_name: dev.mlops_schema.predictions
      output_schema_name: ${bundle.target}.mlops_schema
      assets_dir: /Workspace/Users/${workspace.current_user.userName}/databricks_lakehouse_monitoring
      inference_log:
        granularities: [1 day]
        model_id_col: model_id
        prediction_col: prediction
        label_col: price
        problem_type: PROBLEM_TYPE_REGRESSION
        timestamp_col: timestamp
      schedule:
        quartz_cron_expression: 0 0 8 * * ? # Run every day at 8am
        timezone_id: UTC
YAML
# TimeSeries profile type
resources:
  quality_monitors:
    my_quality_monitor:
      table_name: dev.mlops_schema.predictions
      output_schema_name: ${bundle.target}.mlops_schema
      assets_dir: /Workspace/Users/${workspace.current_user.userName}/databricks_lakehouse_monitoring
      time_series:
        granularities: [30 minutes]
        timestamp_col: timestamp
      schedule:
        quartz_cron_expression: 0 0 8 * * ? # Run every day at 8am
        timezone_id: UTC
YAML
# Snapshot profile type
resources:
  quality_monitors:
    my_quality_monitor:
      table_name: dev.mlops_schema.predictions
      output_schema_name: ${bundle.target}.mlops_schema
      assets_dir: /Workspace/Users/${workspace.current_user.userName}/databricks_lakehouse_monitoring
      snapshot: {}
      schedule:
        quartz_cron_expression: 0 0 8 * * ? # Run every day at 8am
        timezone_id: UTC

registered_model (Unity Catalog)

Type: Map

The registered model resource allows you to define models in Unity Catalog. For information about Unity Catalog registered models, see Manage model lifecycle in Unity Catalog.

YAML
registered_models:
  <registered_model-name>:
    <registered_model-field-name>: <registered_model-field-value>

| Key | Type | Description |
| --- | --- | --- |
| catalog_name | String | The name of the catalog where the schema and the registered model reside. |
| comment | String | The comment attached to the registered model. |
| grants | Sequence | The grants associated with the registered model. See grants. |
| name | String | The name of the registered model. |
| schema_name | String | The name of the schema where the registered model resides. |
| storage_location | String | The storage location on the cloud under which model version data files are stored. |

Example

The following example defines a registered model in Unity Catalog:

YAML
resources:
  registered_models:
    model:
      name: my_model
      catalog_name: ${bundle.target}
      schema_name: mlops_schema
      comment: Registered model in Unity Catalog for ${bundle.target} deployment target
      grants:
        - privileges:
            - EXECUTE
          principal: account users

schema (Unity Catalog)

Type: Map

The schema resource type allows you to define Unity Catalog schemas for tables and other assets in your workflows and pipelines created as part of a bundle. A schema, different from other resource types, has the following limitations:

  • The owner of a schema resource is always the deployment user, and cannot be changed. If run_as is specified in the bundle, it will be ignored by operations on the schema.
  • Only fields supported by the corresponding Schemas object create API are available for the schema resource. For example, enable_predictive_optimization is not supported as it is only available on the update API.
YAML
schemas:
  <schema-name>:
    <schema-field-name>: <schema-field-value>

| Key | Type | Description |
| --- | --- | --- |
| catalog_name | String | The name of the parent catalog. |
| comment | String | A user-provided free-form text description. |
| grants | Sequence | The grants associated with the schema. See grants. |
| name | String | The name of the schema, relative to the parent catalog. |
| properties | Map | A map of key-value properties attached to the schema. |
| storage_root | String | The storage root URL for managed tables within the schema. |

Examples

The following example defines a pipeline with the resource key my_pipeline that creates a Unity Catalog schema with the key my_schema as the target:

YAML
resources:
  pipelines:
    my_pipeline:
      name: test-pipeline-{{.unique_id}}
      libraries:
        - notebook:
            path: ./nb.sql
      development: true
      catalog: main
      target: ${resources.schemas.my_schema.id}

  schemas:
    my_schema:
      name: test-schema-{{.unique_id}}
      catalog_name: main
      comment: This schema was created by DABs.

A top-level grants mapping is not supported by Databricks Asset Bundles, so if you want to set grants for a schema, define the grants for the schema within the schemas mapping. For more information about grants, see Show, grant, and revoke privileges.

The following example defines a Unity Catalog schema with grants:

YAML
resources:
  schemas:
    my_schema:
      name: test-schema
      grants:
        - principal: users
          privileges:
            - SELECT
        - principal: my_team
          privileges:
            - CAN_MANAGE
      catalog_name: main

volume (Unity Catalog)

Type: Map

The volume resource type allows you to define and create Unity Catalog volumes as part of a bundle. When deploying a bundle with a volume defined, note that:

  • A volume cannot be referenced in the artifact_path for the bundle until it exists in the workspace. Hence, if you want to use Databricks Asset Bundles to create the volume, you must first define the volume in the bundle, deploy it to create the volume, then reference it in the artifact_path in subsequent deployments.
  • Volumes in the bundle are not prepended with the dev_${workspace.current_user.short_name} prefix when the deployment target has mode: development configured. However, you can manually configure this prefix. See Custom presets.
YAML
volumes:
  <volume-name>:
    <volume-field-name>: <volume-field-value>

| Key | Type | Description |
| --- | --- | --- |
| catalog_name | String | The name of the catalog of the schema and volume. |
| comment | String | The comment attached to the volume. |
| grants | Sequence | The grants associated with the volume. See grants. |
| name | String | The name of the volume. |
| schema_name | String | The name of the schema where the volume is. |
| storage_location | String | The storage location on the cloud. |
| volume_type | String | The volume type, either EXTERNAL or MANAGED. An external volume is located in the specified external location. A managed volume is located in the default location, which is specified by the parent schema, the parent catalog, or the metastore. |

Example

The following example creates a Unity Catalog volume with the key my_volume:

YAML
resources:
  volumes:
    my_volume:
      catalog_name: main
      name: my_volume
      schema_name: my_schema

For an example bundle that runs a job that writes to a file in a Unity Catalog volume, see the bundle-examples GitHub repository.
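As noted above, once a deployment has created the volume, subsequent deployments can reference it in the bundle's artifact_path. A minimal sketch, assuming the my_volume example above and an illustrative subdirectory name:

YAML
workspace:
  artifact_path: /Volumes/main/my_schema/my_volume/artifacts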

Common objects

grants

Type: Sequence

| Key | Type | Description |
| --- | --- | --- |
| principal | String | The name of the principal that will be granted privileges. |
| privileges | Sequence | The privileges to grant to the specified entity. |
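For example, the following sketch attaches grants to the volume defined earlier; the principal name and privileges are illustrative:

YAML
resources:
  volumes:
    my_volume:
      catalog_name: main
      name: my_volume
      schema_name: my_schema
      grants:
        - principal: data_engineers
          privileges:
            - READ_VOLUME
            - WRITE_VOLUME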