`bundle` command group

note

This information applies to Databricks CLI versions 0.205 and above. The Databricks CLI is in Public Preview.

Databricks CLI use is subject to the Databricks License and Databricks Privacy Notice, including any Usage Data provisions.

The bundle command group within the Databricks CLI enables you to programmatically validate, deploy, and run Databricks workflows such as Databricks jobs, Lakeflow Declarative Pipelines, and MLOps Stacks. See What are Databricks Asset Bundles?

You run bundle commands by appending them to databricks bundle. To display help for the bundle command, run databricks bundle -h.

Create a bundle from a project template

To create a Databricks Asset Bundle using the default Databricks Asset Bundle template for Python, run the bundle init command as follows, and then answer the on-screen prompts:

Bash

databricks bundle init

To create a Databricks Asset Bundle using a custom Databricks Asset Bundle template, run the bundle init command as follows:

Bash
databricks bundle init <project-template-local-path-or-url> \
--project-dir="</local/path/to/project/template/output>"

Display the bundle configuration schema

To display the bundle configuration schema, run the bundle schema command, as follows:

Bash

databricks bundle schema

To output the Databricks Asset Bundle configuration schema as a JSON file, run the bundle schema command and redirect the output to a JSON file. For example, you can generate a file named bundle_config_schema.json within the current directory, as follows:

Bash
databricks bundle schema > bundle_config_schema.json

Validate a bundle

To validate that your bundle configuration files are syntactically correct, run the bundle validate command from the bundle project root, as follows:

Bash

databricks bundle validate

By default, this command returns a summary of the bundle identity:

Output
Name: MyBundle
Target: dev
Workspace:
  Host: https://my-host.cloud.databricks.com
  User: someone@example.com
  Path: /Users/someone@example.com/.bundle/MyBundle/dev

Validation OK!

note

The bundle validate command outputs warnings if resource properties are defined in the bundle configuration files that are not found in the corresponding object's schema.

If you only want to output a summary of the bundle's identity and resources, use bundle summary.
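Because bundle validate reports problems before anything is sent to the workspace, one common pattern is to gate deployment on it. The following is a minimal sketch, not an official workflow: it assumes the CLI exits with a nonzero status when validation fails, and the function name validate_then_deploy is hypothetical.

```bash
# Hypothetical helper: deploy to a target only if validation succeeds.
# Assumption: `databricks bundle validate` exits nonzero on validation errors.
validate_then_deploy() {
  target="$1"   # target name as declared in the bundle configuration files
  if databricks bundle validate -t "$target"; then
    databricks bundle deploy -t "$target"
  else
    echo "Validation failed for target '$target'; skipping deploy." >&2
    return 1
  fi
}

# Usage (run from the bundle project root):
#   validate_then_deploy dev
```

Wrapping both commands in one function keeps the target name consistent between the validation and the deployment.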

Sync a bundle's tree to a workspace

The bundle sync command performs a one-way synchronization of a bundle's file changes within a local filesystem directory, to a directory within a remote Databricks workspace.

note

bundle sync commands cannot synchronize file changes from a directory within a remote Databricks workspace, back to a directory within a local filesystem.

databricks bundle sync commands work in the same way as databricks sync commands and are provided as a productivity convenience. For command usage information, see sync command group.

Generate a bundle configuration file

The bundle generate command generates configuration for a resource that already exists in your Databricks workspace. This command supports apps, dashboards, jobs, and pipelines, as described in the following sections.

By default, this command generates a *.yml file for the resource in the resources folder of the bundle project and also downloads any files, such as notebooks, referenced in the configuration.

important

The bundle generate command is provided as a convenience to autogenerate resource configuration. However, when resource configuration is included in the bundle and deployed, it creates a new resource and does not update the existing resource unless bundle deployment bind has first been used. See Bind a bundle resource.

Generate app configuration

To generate configuration for an existing app in the workspace, run bundle generate app, specifying the name of the app in the workspace:

Bash
databricks bundle generate app --existing-app-name [app-name]

You can get the app name from the Compute > Apps tab of the workspace UI.

For example, the following command generates a new hello_world.app.yml file in the resources bundle project folder, and downloads the app's code files, such as the app's command configuration file app.yaml and the main file app.py. By default, the code files are copied to the bundle's src folder.

Bash

databricks bundle generate app --existing-app-name "hello_world"

YAML
# This is the contents of the resulting /resources/hello-world.app.yml file.
resources:
  apps:
    hello_world:
      name: hello-world
      description: A basic starter application.
      source_code_path: ../src/app

Generate dashboard configuration

To generate configuration for an existing dashboard in the workspace, run bundle generate dashboard, specifying either the ID or workspace path for the dashboard:

Bash
databricks bundle generate dashboard --existing-id [dashboard-id]

Bash
databricks bundle generate dashboard --existing-path [dashboard-workspace-path]

You can copy the workspace path for a dashboard from the workspace UI.

For example, the following command generates a new baby_gender_by_county.dashboard.yml file in the resources bundle project folder containing the YAML below, and downloads the baby_gender_by_county.lvdash.json file to the src project folder.

Bash

databricks bundle generate dashboard --existing-path "/Workspace/Users/someone@example.com/baby_gender_by_county.lvdash.json"

YAML
# This is the contents of the resulting baby_gender_by_county.dashboard.yml file.
resources:
  dashboards:
    baby_gender_by_county:
      display_name: 'Baby gender by county'
      warehouse_id: aae11o8e6fe9zz79
      file_path: ../src/baby_gender_by_county.lvdash.json

tip

To update the .lvdash.json file after you have already deployed a dashboard, use the --resource option when you run bundle generate dashboard to generate that file for the existing dashboard resource. To continuously poll and retrieve updates to a dashboard, use the --force and --watch options.
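The options named in the tip above can be combined into a single command. The following sketch wraps that command in a function; the function name watch_dashboard is hypothetical, and the resource key in the usage comment is the one from the example YAML above.

```bash
# Hypothetical helper: keep the local .lvdash.json of an already-deployed
# dashboard up to date with changes made in the workspace.
# Uses only the options named in the tip: --resource, --force, --watch.
watch_dashboard() {
  resource_key="$1"   # key of the dashboard resource in the bundle configuration
  databricks bundle generate dashboard --resource "$resource_key" --force --watch
}

# Usage (run from the bundle project root):
#   watch_dashboard baby_gender_by_county
```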

Generate job or pipeline configuration

To generate configuration for a job or pipeline, run the bundle generate job or bundle generate pipeline command:

Bash
databricks bundle generate [job|pipeline] --existing-[job|pipeline]-id [job-id|pipeline-id]

note

Currently, only jobs with notebook tasks are supported by this command.

For example, the following command generates a new hello_job.yml file in the resources bundle project folder containing the YAML below, and downloads the simple_notebook.py to the src project folder.

Bash

databricks bundle generate job --existing-job-id 6565621249

YAML
# This is the contents of the resulting hello_job.yml file.
resources:
  jobs:
    hello_job:
      name: 'Hello Job'
      tasks:
        - task_key: run_notebook
          email_notifications: {}
          notebook_task:
            notebook_path: ../src/simple_notebook.py
            source: WORKSPACE
          run_if: ALL_SUCCESS
      max_concurrent_runs: 1

Bind a bundle resource

The bundle deployment bind command allows you to link bundle-defined resources to existing resources in the Databricks workspace so that they become managed by Databricks Asset Bundles. If you bind a resource, the existing Databricks resource in the workspace is updated based on the configuration defined in the bundle it is bound to after the next bundle deploy.

Bash
databricks bundle deployment bind [resource-key] [resource-id]

Bind does not recreate data. For example, if you bind a pipeline that already has data in a catalog, you can deploy to that pipeline without losing the existing data. You also do not need to recompute existing results, such as a materialized view, so the pipeline does not have to rerun.

The bind command should be used with the --target flag. For example, bind your production deployment to your production pipeline using databricks bundle deployment bind --target prod my_pipeline 7668611149d5709ac9-2906-1229-9956-586a9zed8929

tip

It's a good idea to confirm the resource in the workspace before running bind.

Bind is supported for the following resources:

The following command binds the resource hello_job to its remote counterpart in the workspace. The command outputs a diff and allows you to deny the resource binding, but if confirmed, any updates to the job definition in the bundle are applied to the corresponding remote job when the bundle is next deployed.

Bash
databricks bundle deployment bind hello_job 6565621249
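To bring an existing job under bundle management without creating a duplicate, the generate, bind, and deploy steps described in this article can be chained. The following is a sketch under stated assumptions: the function name adopt_existing_job is hypothetical, and the resource key you pass must match the key that bundle generate wrote into the generated configuration file.

```bash
# Hypothetical helper: generate configuration for an existing job, bind the
# bundle resource to it, and then deploy. Each step runs only if the previous
# one succeeded.
adopt_existing_job() {
  resource_key="$1"   # key of the job in the generated bundle configuration
  job_id="$2"         # ID of the existing job in the workspace
  target="$3"         # target name as declared in the bundle configuration
  databricks bundle generate job --existing-job-id "$job_id" &&
    databricks bundle deployment bind --target "$target" "$resource_key" "$job_id" &&
    databricks bundle deploy -t "$target"
}

# Usage (run from the bundle project root):
#   adopt_existing_job hello_job 6565621249 dev
```

Because bind outputs a diff and asks for confirmation, this sequence is better suited to interactive use than to unattended automation.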

Unbind a bundle resource

If you want to remove the link between the resource in a bundle and its remote counterpart in a workspace, use bundle deployment unbind:

Bash
databricks bundle deployment unbind [resource-key]

For example, to unbind the hello_job resource:

Bash

databricks bundle deployment unbind hello_job

Output a bundle summary

The bundle summary command outputs a summary of a bundle's identity and resources, including deep links for resources so that you can easily navigate to the resource in the Databricks workspace.

Bash

databricks bundle summary

The following example output is the summary of a bundle named my_pipeline_bundle that defines a job and a pipeline:

Name: my_pipeline_bundle
Target: dev
Workspace:
  Host: https://myworkspace.cloud.databricks.com
  User: someone@example.com
  Path: /Users/someone@example.com/.bundle/my_pipeline/dev
Resources:
  Jobs:
    my_project_job:
      Name: [dev someone] my_project_job
      URL:  https://myworkspace.cloud.databricks.com/jobs/206000809187888?o=6051000018419999
  Pipelines:
    my_project_pipeline:
      Name: [dev someone] my_project_pipeline
      URL:  https://myworkspace.cloud.databricks.com/pipelines/7f559fd5-zztz-47fa-aa5c-c6bf034b4f58?o=6051000018419999

tip

You can also use bundle open to navigate to a resource in the Databricks workspace. See Open a resource in the workspace.

Deploy a bundle

To deploy a bundle to the remote workspace, run the bundle deploy command from the bundle project root. If no command options are specified, the default target as declared within the bundle configuration files is used.

Bash

databricks bundle deploy

To deploy the bundle to a specific target, set the -t (or --target) option along with the target's name as declared within the bundle configuration files. For example, for a target declared with the name dev:

Bash
databricks bundle deploy -t dev

A bundle can be deployed to multiple workspaces, such as development, staging, and production workspaces. Fundamentally, the root_path property determines a bundle's unique identity; it defaults to ~/.bundle/${bundle.name}/${bundle.target}. Therefore, by default, a bundle's identity consists of the identity of the deployer, the bundle's name, and the bundle's target name. If these are identical across different bundles, the deployments of those bundles interfere with one another.

Furthermore, a bundle deployment tracks the resources it creates in the target workspace by their IDs as a state that is stored in the workspace file system. Resource names are not used to correlate between a bundle deployment and a resource instance, so:

If a resource in the bundle configuration does not exist in the target workspace, it is created.
If a resource in the bundle configuration exists in the target workspace, it is updated in the workspace.
If a resource is removed from the bundle configuration, it is removed from the target workspace if it was previously deployed.
A resource's association with a bundle can only be forgotten if you change the bundle name, the bundle target, or the workspace. You can run bundle validate to output a summary containing these values.
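To make the identity rule concrete, the following sketch composes the default root_path from its three components. It is an illustration only, not how the CLI computes the path: the function name default_root_path is hypothetical, and it assumes that ~ expands to the deployer's /Users/<user> workspace folder, as in the bundle validate example output earlier in this article.

```bash
# Illustration only: mimic the documented default root_path,
# ~/.bundle/${bundle.name}/${bundle.target}.
# Assumption: ~ expands to /Users/<user> in the workspace.
default_root_path() {
  user="$1"     # identity of the deployer
  name="$2"     # bundle name
  target="$3"   # bundle target name
  printf '/Users/%s/.bundle/%s/%s\n' "$user" "$name" "$target"
}

# Usage:
#   default_root_path someone@example.com MyBundle dev
#   prints /Users/someone@example.com/.bundle/MyBundle/dev
```

Two bundles deployed by the same user with the same name and target resolve to the same path, which is why their deployments interfere with one another; changing any one component yields a distinct path.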

Run a job or pipeline

To run a specific job or pipeline, use the bundle run command. You must specify the resource key of the job or pipeline declared within the bundle configuration files. By default, the default target declared within the bundle configuration files is used. For example, to run a job hello_job in the default target, run the following command:

Bash

databricks bundle run hello_job

To run a job with a key hello_job within the context of a target declared with the name dev:

Bash
databricks bundle run -t dev hello_job

Validate a pipeline

If you want to do a pipeline validation run, use the --validate-only option, as shown in the following example:

Bash

databricks bundle run --validate-only my_pipeline

Pass job parameters

To pass job parameters, use the --params option, followed by comma-separated key-value pairs, where the key is the parameter name. For example, the following command sets the parameter with the name message to HelloWorld for the job hello_job:

Bash
databricks bundle run --params message=HelloWorld hello_job

note

You can pass parameters to job tasks using the job task options, but the --params option is the recommended method for passing job parameters. An error occurs if job parameters are specified for a job that doesn't have job parameters defined or if task parameters are specified for a job that has job parameters defined.
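When a job takes several parameters, the comma-separated value for --params can be assembled from individual key-value pairs. The following is a minimal sketch; the function names join_params and run_with_params are hypothetical, and the helper does not escape commas inside values.

```bash
# Hypothetical helper: join key=value arguments with commas, which is the
# format the --params option expects.
join_params() {
  joined=""
  for pair in "$@"; do
    if [ -z "$joined" ]; then
      joined="$pair"
    else
      joined="$joined,$pair"
    fi
  done
  printf '%s\n' "$joined"
}

# Hypothetical helper: run a job by resource key with the given parameters.
run_with_params() {
  job_key="$1"
  shift
  databricks bundle run --params "$(join_params "$@")" "$job_key"
}

# Usage (run from the bundle project root):
#   run_with_params hello_job message=HelloWorld
```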

Cancel a run

To cancel and restart an existing job run or pipeline update, use the --restart option:

Bash
databricks bundle run --restart hello_job

Execute scripts

Append -- (double hyphen) after bundle run to execute scripts with the bundle's configured authentication credentials. For example, the following command prints the current working directory:

Bash
databricks bundle run -- python3 -c 'import os; print(os.getcwd())'

Bundle authentication information is passed to child processes using environment variables. See Databricks client unified authentication.
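One way to see this in practice is to have the child process print one of those variables. The following sketch assumes that DATABRICKS_HOST, one of the variables defined by Databricks client unified authentication, is among those passed to the child process; the function name show_bundle_host is hypothetical.

```bash
# Hypothetical helper: print the workspace host visible to a child process
# started through `bundle run --`.
# Assumption: DATABRICKS_HOST is one of the variables passed to the child.
show_bundle_host() {
  # Single quotes defer expansion of $DATABRICKS_HOST to the child shell.
  databricks bundle run -- sh -c 'echo "Host: $DATABRICKS_HOST"'
}

# Usage (run from the bundle project root):
#   show_bundle_host
```

Avoid printing variables that carry secrets, such as tokens, in shared logs.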

Open a resource in the workspace

To navigate to a bundle resource in the workspace, run the bundle open command from the bundle project root, specifying the resource to open. If a resource key is not specified, this command outputs a list of the bundle's resources from which to choose.

Bash
databricks bundle open [resource-key]

For example, the following command launches a browser and navigates to the baby_gender_by_county dashboard in the bundle in the Databricks workspace that is configured for the bundle:

Bash
databricks bundle open baby_gender_by_county

Destroy a bundle

warning

Destroying a bundle permanently deletes a bundle's previously-deployed jobs, pipelines, and artifacts. This action cannot be undone.

To delete jobs, pipelines, and artifacts that were previously deployed, run the bundle destroy command. The following command deletes all previously-deployed jobs, pipelines, and artifacts that are defined in the bundle configuration files:

Bash

databricks bundle destroy

note

A bundle's identity consists of the bundle name, the bundle target, and the workspace. If you have changed any of these and then attempt to destroy a bundle before deploying it, an error occurs.

By default, you are prompted to confirm permanent deletion of the previously-deployed jobs, pipelines, and artifacts. To skip these prompts and perform automatic permanent deletion, add the --auto-approve option to the bundle destroy command.
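Because --auto-approve removes the confirmation prompt, a small guard can reduce the risk of destroying the wrong target in automation. The following is a sketch, not a recommended policy: the function name destroy_target is hypothetical, and the protected target name prod is an assumption about how your targets are named.

```bash
# Hypothetical helper for cleanup scripts: destroy a target without
# prompting, but refuse to act on a target named "prod".
destroy_target() {
  target="$1"   # target name as declared in the bundle configuration files
  if [ "$target" = "prod" ]; then
    echo "Refusing to auto-destroy target 'prod'." >&2
    return 1
  fi
  databricks bundle destroy -t "$target" --auto-approve
}

# Usage (run from the bundle project root):
#   destroy_target dev
```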
