Migrate existing resources to a bundle

When building your bundle, you may want to include Databricks resources that already exist and are fully configured in the remote workspace. You can use the Databricks CLI bundle generate command to quickly autogenerate configuration in your bundle for existing apps, dashboards, jobs, and pipelines. See Generate a bundle configuration file. For some resources, such as jobs and pipelines, the Databricks UI also provides configuration that you can copy and paste into your bundle resource configuration files.

After you have generated configuration for a resource in your bundle and deployed the bundle, use the bundle deployment bind command to bind a resource in your bundle to the corresponding resource in the workspace. See Bind a bundle resource.
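For example, the following is a minimal sketch of the complete flow for a job. The job ID 6565621249 and resource key hello_job are illustrative; substitute the values from your workspace and bundle:

Bash
# Generate bundle configuration for the existing job (ID is illustrative).
databricks bundle generate job --existing-job-id 6565621249

# Bind the bundle resource key to the remote job so that deployments
# update the existing job rather than creating a new one.
databricks bundle deployment bind hello_job 6565621249

# Deploy the bundle; the bound remote job is updated from the bundle configuration.
databricks bundle deploy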

This page provides simple examples that use the Databricks CLI or UI to generate or retrieve bundle resource configuration.

For details about resource definitions in bundles, see Databricks Asset Bundles resources.

Generate configuration for an existing job or pipeline using the Databricks CLI

To programmatically generate bundle configuration for an existing job or pipeline:

  1. Retrieve the ID of the existing job or pipeline from the Job details or Pipeline details side panel in the UI. Alternatively, use the Databricks CLI databricks jobs list or databricks pipelines list-pipelines command (see the example following these steps).

  2. Run the bundle generate job or bundle generate pipeline Databricks CLI command, setting the job or pipeline ID:

    For a job:

    Bash
    databricks bundle generate job --existing-job-id 6565621249

    For a pipeline:

    Bash
    databricks bundle generate pipeline --existing-pipeline-id 6565621249

    This command creates a bundle configuration file for the resource in the bundle’s resources folder and downloads any referenced artifacts to the src folder.
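For step 1, you can also look up IDs directly from the command line. For example (the --output json flag is a global Databricks CLI option):

Bash
# List jobs in the workspace, including their job IDs.
databricks jobs list --output json

# List pipelines in the workspace, including their pipeline IDs.
databricks pipelines list-pipelines --output json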

You can also generate configuration for an existing dashboard. See Generate dashboard configuration.
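For example, a minimal sketch for a dashboard, where the dashboard ID is illustrative and the --existing-id flag assumes a recent Databricks CLI version:

Bash
# Generate bundle configuration for an existing dashboard.
databricks bundle generate dashboard --existing-id 01ef8d56871e1d50ae30ce7375e42478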

Retrieve an existing job definition using the UI

To retrieve the YAML representation of an existing job definition from the Databricks workspace UI:

  1. In your Databricks workspace’s sidebar, click Workflows.

  2. On the Jobs tab, click your job’s Name link.

  3. Next to the Run now button, click the kebab menu, and then click Edit as YAML.

  4. Copy the YAML and add it to your bundle’s databricks.yml file, or create a configuration file for your job in the resources directory of your bundle project and reference it from your databricks.yml file. See resources.

  5. Download and add any Python files and notebooks that are referenced in the existing job to the bundle’s project source. Bundle artifacts are typically located in the bundle’s src directory.

    tip

    You can export an existing notebook from a Databricks workspace into the .ipynb format by clicking File > Export > IPython Notebook from the Databricks notebook user interface.
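    You can also export a notebook from the command line. The following is a minimal sketch; the workspace path and local file name are illustrative:

    Bash
    # Export a workspace notebook in Jupyter format into the bundle's src folder.
    databricks workspace export /Users/someone@example.com/hello --format JUPYTER --file src/hello.ipynb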

    After you add your notebooks, Python files, and other artifacts to the bundle, make sure that your job definition properly references them. For example, for a notebook named hello.ipynb that is in the src directory of the bundle:

    YAML
    resources:
      jobs:
        hello-job:
          name: hello-job
          tasks:
            - task_key: hello-task
              notebook_task:
                notebook_path: ../src/hello.ipynb
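    After updating the references, you can check that the bundle configuration resolves without errors:

    Bash
    # Validate the bundle configuration before deploying.
    databricks bundle validate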

For more information about viewing jobs as code in the UI, see View jobs as code.

Retrieve an existing pipeline definition using the UI

To retrieve the YAML representation of an existing pipeline definition from the Databricks workspace UI:

  1. In your Databricks workspace’s sidebar, click Workflows.

  2. On the DLT tab, click your pipeline’s Name link.

  3. Next to the Development button, click the kebab menu, and then click View settings YAML.

  4. In the Pipeline settings YAML dialog, click the copy icon to copy the pipeline definition’s YAML to your clipboard.

  5. Add the YAML that you copied to your bundle’s databricks.yml file, or create a configuration file for your pipeline in the resources folder of your bundle project and reference it from your databricks.yml file. See resources.

  6. Download and add any Python files and notebooks that the pipeline references to the bundle’s project source. Bundle artifacts are typically located in the bundle’s src directory.

    tip

    You can export an existing notebook from a Databricks workspace into the .ipynb format by clicking File > Export > IPython Notebook from the Databricks notebook user interface.

    After you add your notebooks, Python files, and other artifacts to the bundle, make sure that your pipeline definition properly references them. For example, for a notebook named hello.ipynb that is in the src directory of the bundle:

    YAML
    resources:
      pipelines:
        hello-pipeline:
          name: hello-pipeline
          libraries:
            - notebook:
                path: ../src/hello.ipynb
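    Once the references are in place, you can deploy the bundle and start the pipeline. The dev target and the hello-pipeline resource key are illustrative:

    Bash
    # Deploy the bundle to the dev target, then run the pipeline by its resource key.
    databricks bundle deploy -t dev
    databricks bundle run -t dev hello-pipeline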

Bind a resource to its remote counterpart

Typically, after you add a resource to your bundle, you want to ensure that the resource in your bundle and the existing resource in the workspace stay in sync. The bundle deployment bind command allows you to link them. If you bind a resource, the linked Databricks resource in the workspace is updated based on the configuration defined in the bundle on the next bundle deploy. For a list of resources that support bundle deployment bind, see Bind a bundle resource.

For example, the following command binds the resource hello_job to its remote counterpart in the workspace. The command prompts for confirmation that updates to the job configuration in the bundle should be applied to the corresponding remote job when the bundle is next deployed.

Bash
databricks bundle deployment bind hello_job 6565621249

To remove the link between a bundle resource and its counterpart in the workspace, use bundle deployment unbind. See Unbind a bundle resource.

Bash
databricks bundle deployment unbind hello_job