bundle command group

note

This information applies to Databricks CLI versions 0.205 and above. The Databricks CLI is in Public Preview.

Databricks CLI use is subject to the Databricks License and Databricks Privacy Notice, including any Usage Data provisions.

The bundle command group within the Databricks CLI contains commands for managing Databricks Asset Bundles. Databricks Asset Bundles let you express projects as code and programmatically validate, deploy, and run Databricks workflows such as Databricks jobs, Lakeflow Declarative Pipelines, and MLOps Stacks. See What are Databricks Asset Bundles?.

databricks bundle deploy

Deploy a bundle to the remote workspace.

databricks bundle deploy [flags]

Bundle target and identity

To deploy the bundle to a specific target, set the -t (or --target) option along with the target's name as declared within the bundle configuration files. If no command options are specified, the default target as declared within the bundle configuration files is used. For example, for a target declared with the name dev:

Bash
databricks bundle deploy -t dev

A bundle can be deployed to multiple workspaces, such as development, staging, and production workspaces. Fundamentally, the root_path property determines a bundle's unique identity, and it defaults to ~/.bundle/${bundle.name}/${bundle.target}. Therefore, by default, a bundle's identity consists of the identity of the deployer, the bundle's name, and the bundle's target name. If these are identical across different bundles, deployments of these bundles will interfere with one another.
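
To keep deployments isolated, you can override root_path for a specific target in the bundle configuration. The following is a minimal sketch; the target names and path are illustrative:

YAML
# Hypothetical targets; prod deployments are written to a shared, target-specific path.
targets:
  dev:
    default: true
  prod:
    workspace:
      root_path: /Workspace/Shared/.bundle/${bundle.name}/${bundle.target}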

Furthermore, a bundle deployment tracks the resources it creates in the target workspace by their IDs, in a state file stored in the workspace file system. Resource names are not used to correlate a bundle deployment with a resource instance, so:

  • If a resource in the bundle configuration does not exist in the target workspace, it is created.
  • If a resource in the bundle configuration exists in the target workspace, it is updated in the workspace.
  • If a resource is removed from the bundle configuration, it is removed from the target workspace if it was previously deployed.
  • A resource's association with a bundle is forgotten only if you change the bundle name, the bundle target, or the workspace. You can run bundle validate to output a summary containing these values.

Options

--auto-approve

    Skip interactive approvals that might be required for deployment.

-c, --cluster-id string

    Override cluster in the deployment with the given cluster ID.

--fail-on-active-runs

    Fail if there are running jobs or pipelines in the deployment.

--force

    Force-override Git branch validation.

--force-lock

    Force acquisition of deployment lock.

Global flags

Examples

The following example deploys a bundle using a specific cluster ID:

Bash
databricks bundle deploy --cluster-id 0123-456789-abcdef

databricks bundle deployment

Deployment-related commands.

databricks bundle deployment [command]

Available Commands

  • bind - Bind a bundle-defined resource to an existing resource in the remote workspace.
  • unbind - Unbind a bundle-defined resource from its remote resource.

databricks bundle deployment bind

Link bundle-defined resources to existing resources in the Databricks workspace so that they become managed by Databricks Asset Bundles. If you bind a resource, then after the next bundle deploy, the existing Databricks resource in the workspace is updated based on the configuration defined in the bundle it is bound to.

databricks bundle deployment bind KEY RESOURCE_ID [flags]

Bind does not recreate data. For example, if bind is applied to a pipeline that has data in a catalog, you can deploy to that pipeline without losing the existing data. In addition, materialized views, for example, do not need to be recomputed, so pipelines do not have to rerun.

The bind command should be used with the --target flag. For example, the following command binds your production deployment to your production pipeline:

Bash
databricks bundle deployment bind --target prod my_pipeline 7668611149d5709ac9-2906-1229-9956-586a9zed8929

tip

It's a good idea to confirm the resource in the workspace before running bind.

Bind is supported for the following resources:

Arguments

KEY

    The resource key to bind

RESOURCE_ID

    The ID of the existing resource to bind to

Options

--auto-approve

    Automatically approve the binding, instead of prompting

--force-lock

    Force acquisition of deployment lock.

Global flags

Examples

The following command binds the resource hello_job to its remote counterpart in the workspace. The command outputs a diff and allows you to deny the resource binding, but if confirmed, any updates to the job definition in the bundle are applied to the corresponding remote job when the bundle is next deployed.

Bash
databricks bundle deployment bind hello_job 6565621249

databricks bundle deployment unbind

Remove the link between the resource in a bundle and its remote counterpart in a workspace.

databricks bundle deployment unbind KEY [flags]

Arguments

KEY

    The resource key to unbind

Options

--force-lock

    Force acquisition of deployment lock.

Global flags

Examples

The following example unbinds the hello_job resource:

Bash
databricks bundle deployment unbind hello_job

databricks bundle destroy

warning

Destroying a bundle permanently deletes a bundle's previously deployed jobs, pipelines, and artifacts. This action cannot be undone.

Delete jobs, pipelines, other resources, and artifacts that were previously deployed.

databricks bundle destroy [flags]
note

A bundle's identity consists of the bundle name, the bundle target, and the workspace. If you have changed any of these and then attempt to destroy the bundle before deploying it again, an error occurs.

By default, you are prompted to confirm permanent deletion of the previously deployed jobs, pipelines, and artifacts. To skip these prompts and perform automatic permanent deletion, add the --auto-approve option to the bundle destroy command.
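
For example, the following command skips the confirmation prompts and permanently deletes the bundle's deployed resources:

Bash
databricks bundle destroy --auto-approve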

Options

--auto-approve

    Skip interactive approvals for deleting resources and files

--force-lock

    Force acquisition of deployment lock.

Global flags

Examples

The following command deletes all previously deployed resources and artifacts that are defined in the bundle configuration files:

Bash
databricks bundle destroy

databricks bundle generate

Generate bundle configuration for a resource that already exists in your Databricks workspace. The following resources are supported: app, dashboard, job, pipeline.

By default, this command generates a *.yml file for the resource in the resources folder of the bundle project and also downloads any files, such as notebooks, referenced in the configuration.

important

The bundle generate command is provided as a convenience to autogenerate resource configuration. However, when resource configuration is included in the bundle and deployed, it creates a new resource and does not update the existing resource unless bundle deployment bind has first been used. See databricks bundle deployment bind.
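
For example, the following sketch shows one way to adopt an existing job into a bundle: generate its configuration, bind the generated resource to the existing job, and then deploy. The job ID and resource key are illustrative:

Bash
# Generate configuration for an existing job under the resource key hello_job.
databricks bundle generate job --existing-job-id 6565621249 --key hello_job
# Bind the generated resource to the existing job so deploys update it instead of creating a new one.
databricks bundle deployment bind hello_job 6565621249
# Deploy the bundle; the bound job is updated in place.
databricks bundle deploy -t dev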

databricks bundle generate [command]

Available Commands

  • app - Generate bundle configuration for a Databricks app.
  • dashboard - Generate configuration for a dashboard.
  • job - Generate bundle configuration for a job.
  • pipeline - Generate bundle configuration for a pipeline.

Options

--key string

    Resource key to use for the generated configuration

Global flags

databricks bundle generate app

Generate bundle configuration for an existing Databricks app in the workspace.

databricks bundle generate app [flags]

Options

-d, --config-dir string

    Directory path where the output bundle config will be stored (default "resources")

--existing-app-name string

    App name to generate config for

-f, --force

    Force overwrite existing files in the output directory

-s, --source-dir string

    Directory path where the app files will be stored (default "src/app")

Global flags

Examples

The following example generates configuration for an existing app named my-app. You can get the app name from the Compute > Apps tab of the workspace UI.

Bash
databricks bundle generate app --existing-app-name my-app

The following command generates a new hello_world.app.yml file in the resources bundle project folder, and downloads the app's code files, such as the app's command configuration file app.yaml and main app.py. By default, the code files are copied to the bundle's src folder.

Bash
databricks bundle generate app --existing-app-name "hello_world"
YAML
# This is the contents of the resulting /resources/hello-world.app.yml file.
resources:
  apps:
    hello_world:
      name: hello-world
      description: A basic starter application.
      source_code_path: ../src/app

databricks bundle generate dashboard

Generate configuration for an existing dashboard in the workspace.

databricks bundle generate dashboard [flags]
tip

To update the .lvdash.json file after you have already deployed a dashboard, use the --resource option when you run bundle generate dashboard to generate that file for the existing dashboard resource. To continuously poll and retrieve updates to a dashboard, use the --force and --watch options.
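
For example, the following sketch regenerates the dashboard file for an existing dashboard resource and keeps polling for changes; the resource key is illustrative:

Bash
databricks bundle generate dashboard --resource baby_gender_by_county --force --watch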

Options

-s, --dashboard-dir string

    Directory to write the dashboard representation to (default "src")

--existing-id string

    ID of the dashboard to generate configuration for

--existing-path string

    Workspace path of the dashboard to generate configuration for

-f, --force

    Force overwrite existing files in the output directory

--resource string

    Resource key of dashboard to watch for changes

-d, --resource-dir string

    Directory to write the configuration to (default "resources")

--watch

    Watch for changes to the dashboard and update the configuration

Global flags

Examples

The following example generates configuration for an existing dashboard by its ID:

Bash
databricks bundle generate dashboard --existing-id abc123

You can also generate configuration for an existing dashboard by workspace path. Copy the workspace path for a dashboard from the workspace UI.

For example, the following command generates a new baby_gender_by_county.dashboard.yml file in the resources bundle project folder containing the YAML below, and downloads the baby_gender_by_county.lvdash.json file to the src project folder.

Bash
databricks bundle generate dashboard --existing-path "/Workspace/Users/someone@example.com/baby_gender_by_county.lvdash.json"
YAML
# This is the contents of the resulting baby_gender_by_county.dashboard.yml file.
resources:
  dashboards:
    baby_gender_by_county:
      display_name: 'Baby gender by county'
      warehouse_id: aae11o8e6fe9zz79
      file_path: ../src/baby_gender_by_county.lvdash.json

databricks bundle generate job

Generate bundle configuration for a job.

note

Currently, only jobs with notebook tasks are supported by this command.

databricks bundle generate job [flags]

Options

-d, --config-dir string

    Dir path where the output config will be stored (default "resources")

--existing-job-id int

    Job ID of the job to generate config for

-f, --force

    Force overwrite existing files in the output directory

-s, --source-dir string

    Dir path where the downloaded files will be stored (default "src")

Global flags

Examples

The following example generates a new hello_job.yml file in the resources bundle project folder containing the YAML below, and downloads the simple_notebook.py to the src project folder.

Bash
databricks bundle generate job --existing-job-id 6565621249
YAML
# This is the contents of the resulting hello_job.yml file.
resources:
  jobs:
    hello_job:
      name: 'Hello Job'
      tasks:
        - task_key: run_notebook
          email_notifications: {}
          notebook_task:
            notebook_path: ../src/simple_notebook.py
            source: WORKSPACE
          run_if: ALL_SUCCESS
      max_concurrent_runs: 1

databricks bundle generate pipeline

Generate bundle configuration for a pipeline.

databricks bundle generate pipeline [flags]

Options

-d, --config-dir string

    Dir path where the output config will be stored (default "resources")

--existing-pipeline-id string

    ID of the pipeline to generate config for

-f, --force

    Force overwrite existing files in the output directory

-s, --source-dir string

    Dir path where the downloaded files will be stored (default "src")

Global flags

Examples

The following example generates configuration for an existing pipeline:

Bash
databricks bundle generate pipeline --existing-pipeline-id abc-123-def

databricks bundle init

Initialize a new bundle using a bundle template. Templates can be configured to prompt the user for values. See Databricks Asset Bundle project templates.

databricks bundle init [TEMPLATE_PATH] [flags]

Arguments

TEMPLATE_PATH

    Template to use for initialization (optional)

Options

--branch string

    Git branch to use for template initialization

--config-file string

    JSON file containing key value pairs of input parameters required for template initialization.

--output-dir string

    Directory to write the initialized template to.

--tag string

    Git tag to use for template initialization

--template-dir string

    Directory path within a Git repository containing the template.

Global flags

Examples

The following example prompts with a list of default bundle templates from which to choose:

Bash
databricks bundle init

The following example initializes a bundle using the default Python template:

Bash
databricks bundle init default-python

To create a Databricks Asset Bundle using a custom Databricks Asset Bundle template, specify the custom template path:

Bash
databricks bundle init <project-template-local-path-or-url> \
--output-dir="</local/path/to/project/template/output>"

The following example initializes a bundle from a Git repository:

Bash
databricks bundle init https://github.com/my/repository

The following example initializes a bundle from a Git repository, using a specific branch:

Bash
databricks bundle init https://github.com/my/repository --branch main
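
To initialize a template non-interactively, you can provide the template's input parameters in a JSON file. The following is a sketch; the config file name and output directory are illustrative:

Bash
databricks bundle init default-python --config-file ./config.json --output-dir ./my_project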

databricks bundle open

Navigate to a bundle resource in the workspace, specifying the resource to open. If a resource key is not specified, this command outputs a list of the bundle's resources from which to choose.

databricks bundle open [flags]

Options

--force-pull

    Skip local cache and load the state from the remote workspace

Global flags

Examples

The following example launches a browser and navigates to the baby_gender_by_county dashboard in the bundle in the Databricks workspace that is configured for the bundle:

Bash
databricks bundle open baby_gender_by_county

databricks bundle run

Run a job, pipeline, or script. If you don't specify a resource, the command prompts with defined jobs, pipelines, and scripts from which to choose. Alternatively, specify the job or pipeline key or script name declared within the bundle configuration files.

databricks bundle run [flags] [KEY]

Validate a pipeline

If you want to do a pipeline validation run, use the --validate-only option, as shown in the following example:

Bash
databricks bundle run --validate-only my_pipeline

Pass job parameters

To pass job parameters, use the --params option, followed by comma-separated key-value pairs, where the key is the parameter name. For example, the following command sets the parameter with the name message to HelloWorld for the job hello_job:

Bash
databricks bundle run --params message=HelloWorld hello_job
note

As shown in the following examples, you can pass parameters to job tasks using the job task options, but the --params option is the recommended method for passing job parameters. An error occurs if job parameters are specified for a job that doesn't have job parameters defined or if task parameters are specified for a job that has job parameters defined.

You can also specify keyword or positional arguments. If the specified job uses job parameters or the job has a notebook task with parameters, flag names are mapped to the parameter names:

Bash
databricks bundle run hello_job -- --key1 value1 --key2 value2

Or if the specified job does not use job parameters and the job has a Python file task or a Python wheel task:

Bash
databricks bundle run my_job -- value1 value2 value3

Execute scripts

To execute scripts such as integration tests with a bundle's configured authentication credentials, you can either run scripts inline or run a script defined in the bundle configuration. Scripts are run using the same authentication context configured in the bundle.

  • Append a double hyphen (--) after bundle run to run scripts inline. For example, the following command outputs the current user's current working directory:

    Bash
    databricks bundle run -- python3 -c 'import os; print(os.getcwd())'
  • Alternatively, define a script within the scripts mapping in your bundle configuration, then use bundle run to run the script:

    YAML
    scripts:
      my_script:
        content: python3 -c 'import os; print(os.getcwd())'
    Bash
    databricks bundle run my_script

    For more information about the scripts configuration, see scripts.

Bundle authentication information is passed to child processes using environment variables. See Databricks client unified authentication.
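
For example, a child process started by bundle run can read the bundle's authentication settings from environment variables such as DATABRICKS_HOST. The following is a minimal sketch and assumes that variable is populated by the bundle's configured authentication:

Bash
# Print the workspace host that the bundle's authentication resolves to (illustrative).
databricks bundle run -- python3 -c 'import os; print(os.environ.get("DATABRICKS_HOST"))'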

Arguments

KEY

    The unique identifier of the resource to run (optional)

Options

--no-wait

    Don't wait for the run to complete.

--restart

    Restart the run if it is already running.

Global flags

Job Flags

The following flags are job-level parameter flags. See Configure job parameters.

--params stringToString

    Comma-separated k=v pairs for job parameters (default [])

Job Task Flags

The following flags are task-level parameter flags. See Configure task parameters. Databricks recommends using job-level parameters (--params) over task-level parameters.

--dbt-commands strings

    A list of commands to execute for jobs with DBT tasks.

--jar-params strings

    A list of parameters for jobs with Spark JAR tasks.

--notebook-params stringToString

    A map from keys to values for jobs with notebook tasks. (default [])

--pipeline-params stringToString

    A map from keys to values for jobs with pipeline tasks. (default [])

--python-named-params stringToString

    A map from keys to values for jobs with Python wheel tasks. (default [])

--python-params strings

    A list of parameters for jobs with Python tasks.

--spark-submit-params strings

    A list of parameters for jobs with Spark submit tasks.

--sql-params stringToString

    A map from keys to values for jobs with SQL tasks. (default [])

Pipeline Flags

The following flags are pipeline flags.

--full-refresh strings

    List of tables to reset and recompute.

--full-refresh-all

    Perform a full graph reset and recompute.

--refresh strings

    List of tables to update.

--refresh-all

    Perform a full graph update.

--validate-only

    Perform an update to validate graph correctness.

Examples

The following example runs a job hello_job in the default target:

Bash
databricks bundle run hello_job

The following example runs a job hello_job within the context of a target declared with the name dev:

Bash
databricks bundle run -t dev hello_job

The following example cancels and restarts an existing job run:

Bash
databricks bundle run --restart hello_job

The following example runs a pipeline with full refresh:

Bash
databricks bundle run my_pipeline --full-refresh-all
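
The following example updates only specific tables in a pipeline; the table names are illustrative:

Bash
databricks bundle run my_pipeline --refresh sales,customers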

The following example executes a command in the bundle context:

Bash
databricks bundle run -- echo "hello, world"

databricks bundle schema

Display JSON Schema for the bundle configuration.

databricks bundle schema [flags]

Options

Global flags

Examples

The following example outputs the JSON schema for the bundle configuration:

Bash
databricks bundle schema

To output the bundle configuration schema as a JSON file, run the bundle schema command and redirect the output to a JSON file. For example, you can generate a file named bundle_config_schema.json within the current directory:

Bash
databricks bundle schema > bundle_config_schema.json

databricks bundle summary

Output a summary of a bundle's identity and resources, including deep links for resources so that you can easily navigate to the resource in the Databricks workspace.

databricks bundle summary [flags]
tip

You can also use bundle open to navigate to a resource in the Databricks workspace. See databricks bundle open.

Options

--force-pull

    Skip local cache and load the state from the remote workspace

Global flags

Examples

The following example outputs a summary of a bundle's deployed resources:

Bash
databricks bundle summary

The following output is the summary of a bundle named my_pipeline_bundle that defines a job and a pipeline:

Output
Name: my_pipeline_bundle
Target: dev
Workspace:
  Host: https://myworkspace.cloud.databricks.com
  User: someone@example.com
  Path: /Users/someone@example.com/.bundle/my_pipeline/dev
Resources:
  Jobs:
    my_project_job:
      Name: [dev someone] my_project_job
      URL:  https://myworkspace.cloud.databricks.com/jobs/206000809187888?o=6051000018419999
  Pipelines:
    my_project_pipeline:
      Name: [dev someone] my_project_pipeline
      URL:  https://myworkspace.cloud.databricks.com/pipelines/7f559fd5-zztz-47fa-aa5c-c6bf034b4f58?o=6051000018419999

databricks bundle sync

Perform a one-way synchronization of a bundle's file changes from a local filesystem directory to a directory within a remote Databricks workspace.

note

bundle sync commands cannot synchronize file changes from a directory within a remote Databricks workspace back to a directory within a local filesystem.

databricks bundle sync [flags]

databricks bundle sync commands work in the same way as databricks sync commands and are provided as a productivity convenience. For command usage information, see sync command.

Options

--dry-run

    Simulate sync execution without making actual changes

--full

    Perform full synchronization (default is incremental)

--interval duration

    File system polling interval (for --watch) (default 1s)

--output type

    Type of the output format

--watch

    Watch local file system for changes

Global flags

Examples

The following example performs a dry run sync:

Bash
databricks bundle sync --dry-run

The following example watches for changes and syncs automatically:

Bash
databricks bundle sync --watch

The following example performs a full synchronization:

Bash
databricks bundle sync --full
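
The following example watches for changes using a longer polling interval; the interval value is illustrative:

Bash
databricks bundle sync --watch --interval 30s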

databricks bundle validate

Validate that bundle configuration files are syntactically correct.

databricks bundle validate [flags]

By default this command returns a summary of the bundle identity:

Output
Name: MyBundle
Target: dev
Workspace:
  Host: https://my-host.cloud.databricks.com
  User: someone@example.com
  Path: /Users/someone@example.com/.bundle/MyBundle/dev

Validation OK!
note

The bundle validate command outputs warnings if resource properties are defined in the bundle configuration files that are not found in the corresponding object's schema.

If you only want to output a summary of the bundle's identity and resources, use bundle summary.

Options

Global flags

Examples

The following example validates the bundle configuration:

Bash
databricks bundle validate
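
The following example requests JSON output using the global -o flag; assuming your CLI version supports JSON output for this command, it prints the resolved bundle configuration:

Bash
databricks bundle validate -o json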

Global flags

--debug

    Whether to enable debug logging.

-h or --help

    Display help for the Databricks CLI or the related command group or the related command.

--log-file string

    A string representing the file to write output logs to. If this flag is not specified then the default is to write output logs to stderr.

--log-format format

    The log format type, text or json. The default value is text.

--log-level string

    A string representing the log level. If this flag is not specified, then logging is disabled.

-o, --output type

    The command output type, text or json. The default value is text.

-p, --profile string

    The name of the profile in the ~/.databrickscfg file to use to run the command. If this flag is not specified, the profile named DEFAULT is used, if it exists.

--progress-format format

    The format to display progress logs: default, append, inplace, or json

-t, --target string

    If applicable, the bundle target to use

--var strings

    Set values for variables defined in bundle config. Example: --var="foo=bar"
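
Examples

The following sketch combines several global flags with a bundle command; the profile name and variable value are illustrative:

Bash
databricks bundle deploy -t prod -p DEFAULT --var="foo=bar"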