Bundle configuration examples
This article provides example configuration for Databricks Asset Bundles features and common bundle use cases.
Complete bundle examples, described in the following list, are available in the bundle-examples GitHub repository:
- A bundle with a Databricks app backed by an OLTP Postgres database
- A bundle with an AI/BI dashboard and a job that captures a snapshot of the dashboard and emails it to a subscriber
- A bundle that defines an OLTP database instance and a database catalog
- A bundle that defines a Databricks app
- A bundle that defines and uses a development (all-purpose) cluster
- A bundle that defines a secret scope and a job with a task that reads from it
- A bundle that defines and uses a job with multiple wheel dependencies
- A bundle with multiple jobs with run job tasks
- A bundle with a job that uses a SQL notebook task
- A bundle that defines a Unity Catalog schema and a pipeline that uses it
- A bundle that uses a private wheel package from a job
- A bundle that builds a
- A bundle that uses serverless compute to run a job
- A bundle that includes files located outside the bundle root directory
- A bundle that defines and uses a Spark JAR task
- A bundle that writes a file to a Unity Catalog volume
Bundle scenarios
This section contains configuration examples that demonstrate using top-level bundle mappings. See Configuration reference.
Bundle that uploads a JAR file to Unity Catalog
You can specify a Unity Catalog volume as the artifact path so that all artifacts, such as JAR files and wheel files, are uploaded to the volume. The following example bundle builds a JAR file and uploads it to a Unity Catalog volume. For information about the artifact_path mapping, see artifact_path. For information about artifacts, see artifacts.
bundle:
  name: jar-bundle
workspace:
  host: https://myworkspace.cloud.databricks.com
  # Upload all bundle artifacts, including the built JAR, to this Unity Catalog volume
  artifact_path: /Volumes/main/default/my_volume
artifacts:
  my_java_code:
    path: ./sample-java
    # Compile the Java source and package it into a JAR when the bundle is deployed
    build: 'javac PrintArgs.java && jar cvfm PrintArgs.jar META-INF/MANIFEST.MF PrintArgs.class'
    files:
      - source: ./sample-java/PrintArgs.jar
resources:
  jobs:
    jar_job:
      name: 'Spark Jar Job'
      tasks:
        - task_key: SparkJarTask
          new_cluster:
            num_workers: 1
            spark_version: '14.3.x-scala2.12'
            node_type_id: 'i3.xlarge'
          spark_jar_task:
            main_class_name: PrintArgs
          libraries:
            - jar: ./sample-java/PrintArgs.jar
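This example assumes that the volume /Volumes/main/default/my_volume already exists in Unity Catalog. If you want the same bundle to create the volume, you can also define it as a volume resource. The following is a minimal sketch; the catalog and schema names are placeholders and must already exist in your metastore:
resources:
  volumes:
    my_volume:
      # The catalog and schema are assumed to exist; the bundle creates only the volume.
      catalog_name: main
      schema_name: default
      name: my_volume
      comment: Stores bundle artifacts such as JAR files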
Job configuration
This section contains job configuration examples. For job configuration details, see job.
Job that uses serverless compute
Databricks Asset Bundles support jobs that run on serverless compute. See Run your Lakeflow Jobs with serverless compute for workflows. To configure a serverless job, either omit the clusters setting for a job with a notebook task, or specify an environment, as shown in the following examples. For Python script, Python wheel, and dbt tasks, environment_key is required for serverless compute. See environment_key.
# A serverless job (no cluster definition)
resources:
  jobs:
    serverless_job_no_cluster:
      name: serverless_job_no_cluster
      email_notifications:
        on_failure:
          - someone@example.com
      tasks:
        - task_key: notebook_task
          notebook_task:
            notebook_path: ../src/notebook.ipynb
# A serverless job (environment spec)
resources:
  jobs:
    serverless_job_environment:
      name: serverless_job_environment
      tasks:
        - task_key: task
          spark_python_task:
            python_file: ../src/main.py
          # The key that references an environment spec in a job.
          # https://docs.databricks.com/api/workspace/jobs/create#tasks-environment_key
          environment_key: default
      # A list of task execution environment specifications that can be referenced by tasks of this job.
      environments:
        - environment_key: default
          # Full documentation of this spec can be found at:
          # https://docs.databricks.com/api/workspace/jobs/create#environments-spec
          spec:
            environment_version: '2'
            dependencies:
              - my-library
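As noted above, Python wheel tasks on serverless compute also require an environment_key. The following sketch shows one way to configure this; the package name, entry point, and wheel location are assumptions about your project layout:
# A serverless job with a Python wheel task (sketch)
resources:
  jobs:
    serverless_wheel_job:
      name: serverless_wheel_job
      tasks:
        - task_key: wheel_task
          python_wheel_task:
            package_name: my_package
            entry_point: main
          # Required for Python wheel tasks on serverless compute
          environment_key: default
      environments:
        - environment_key: default
          spec:
            environment_version: '2'
            dependencies:
              # Install the wheel built by this bundle
              - ../dist/*.whl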
Job with multiple wheel files
The following example configuration defines a bundle that contains a job with multiple *.whl files as dependencies.
# job.yml
resources:
  jobs:
    example_job:
      name: 'Example with multiple wheels'
      tasks:
        - task_key: task
          spark_python_task:
            python_file: ../src/call_wheel.py
          libraries:
            - whl: ../my_custom_wheel1/dist/*.whl
            - whl: ../my_custom_wheel2/dist/*.whl
          # A single-node cluster (zero workers) for this task
          new_cluster:
            node_type_id: i3.xlarge
            num_workers: 0
            spark_version: 14.3.x-scala2.12
            spark_conf:
              'spark.databricks.cluster.profile': 'singleNode'
              'spark.master': 'local[*, 4]'
            custom_tags:
              'ResourceClass': 'SingleNode'
# databricks.yml
bundle:
  name: job_with_multiple_wheels
include:
  - ./resources/job.yml
workspace:
  host: https://myworkspace.cloud.databricks.com
artifacts:
  my_custom_wheel1:
    type: whl
    build: poetry build
    path: ./my_custom_wheel1
  my_custom_wheel2:
    type: whl
    build: poetry build
    path: ./my_custom_wheel2
targets:
  dev:
    default: true
    mode: development
Job with parameters
The following example configuration defines a job with parameters. For more information about parameterizing jobs, see Parameterize jobs.
resources:
  jobs:
    job_with_parameters:
      name: job_with_parameters
      tasks:
        - task_key: task_a
          spark_python_task:
            python_file: ../src/file.py
            parameters:
              - '--param1={{ job.parameters.param1 }}'
              - '--param2={{ job.parameters.param2 }}'
          new_cluster:
            node_type_id: i3.xlarge
            num_workers: 1
            spark_version: 14.3.x-scala2.12
      parameters:
        - name: param1
          default: value1
        - name: param2
          default: value1
These parameters can be set at runtime by passing job parameters to bundle run, for example:
databricks bundle run job_with_parameters -- --param1=value2 --param2=value2
Job that uses a requirements.txt file
The following example configuration defines a job that uses a requirements.txt file.
resources:
  jobs:
    job_with_requirements_txt:
      name: 'Example job that uses a requirements.txt file'
      tasks:
        - task_key: task
          job_cluster_key: default
          spark_python_task:
            python_file: ../src/main.py
          libraries:
            - requirements: /Workspace/${workspace.file_path}/requirements.txt
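The task above references a job cluster through job_cluster_key: default, which must be defined in the same job's job_clusters mapping. The following sketch shows one possible definition; the node type and Spark version are examples:
resources:
  jobs:
    job_with_requirements_txt:
      # ...
      job_clusters:
        - job_cluster_key: default
          new_cluster:
            node_type_id: i3.xlarge
            num_workers: 1
            spark_version: 14.3.x-scala2.12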
Job on a schedule
The following examples show configuration for jobs that run on a schedule or a periodic trigger. For information about job schedules and triggers, see Automating jobs with schedules and triggers.
This configuration defines a job that runs daily at a specified time:
resources:
  jobs:
    my-notebook-job:
      name: my-notebook-job
      tasks:
        - task_key: my-notebook-task
          notebook_task:
            notebook_path: ./my-notebook.ipynb
      schedule:
        quartz_cron_expression: '0 0 8 * * ?' # daily at 8am
        timezone_id: UTC
        pause_status: UNPAUSED
The following configuration defines a job that runs one week after its last run:
resources:
  jobs:
    my-notebook-job:
      name: my-notebook-job
      tasks:
        - task_key: my-notebook-task
          notebook_task:
            notebook_path: ./my-notebook.ipynb
      trigger:
        pause_status: UNPAUSED
        periodic:
          interval: 1
          unit: WEEKS
Pipeline configuration
This section contains pipeline configuration examples. For pipeline configuration information, see pipeline.
Pipeline that uses serverless compute
Databricks Asset Bundles support pipelines that run on serverless compute. To configure this, set the pipeline serverless setting to true. The following example configuration defines a pipeline that runs on serverless compute with dependencies installed, and a job that triggers a refresh of the pipeline every hour.
# A pipeline that runs on serverless compute
resources:
  pipelines:
    my_pipeline:
      name: my_pipeline
      target: ${bundle.environment}
      serverless: true
      environment:
        dependencies:
          - 'dist/*.whl'
      catalog: users
      libraries:
        - notebook:
            path: ../src/my_pipeline.ipynb
      configuration:
        bundle.sourcePath: /Workspace/${workspace.file_path}/src
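The dist/*.whl entry in the pipeline environment above assumes that the bundle builds a Python wheel artifact. A minimal artifacts mapping for such a build might look like the following sketch, which reuses the Poetry-based build shown earlier in this article; the artifact key and path are assumptions about your project layout:
artifacts:
  my_pipeline_wheel:
    type: whl
    build: poetry build
    path: .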
# A job that triggers a refresh of the pipeline every hour
resources:
  jobs:
    my_job:
      name: my_job
      # Run this job once an hour.
      trigger:
        periodic:
          interval: 1
          unit: HOURS
      email_notifications:
        on_failure:
          - someone@example.com
      tasks:
        - task_key: refresh_pipeline
          pipeline_task:
            pipeline_id: ${resources.pipelines.my_pipeline.id}