Add tasks to jobs in Databricks Asset Bundles

Preview

This feature is in Public Preview.

This article provides examples of various types of tasks that you can add to Databricks jobs in Databricks Asset Bundles. See What are Databricks Asset Bundles?.

Notebook task

You use this task to run a notebook.

The following example adds a notebook task to a job. The path for the notebook to deploy is relative to the databricks.yml file in which this task is declared. The task gets the notebook from its deployed location in the Databricks workspace. (Ellipses indicate omitted content for brevity.)

# ...
resources:
  jobs:
    my-notebook-job:
      name: my-notebook-job
      # ...
      tasks:
        - task_key: my-notebook-task
          notebook_task:
            notebook_path: ./my-notebook.ipynb
          # ...
# ...

For additional mappings that you can set for this task, see tasks > notebook_task in the create job operation’s request payload as defined in POST /api/2.1/jobs/create in the REST API reference, expressed in YAML format. See also “Notebook” in Task type options and Pass parameters to a Databricks job task.
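If the notebook reads widget values, you can pass them through the notebook_task’s base_parameters mapping. The following sketch assumes a notebook widget named env; the widget name and value are illustrative.

```yaml
# ...
      tasks:
        - task_key: my-notebook-task
          notebook_task:
            notebook_path: ./my-notebook.ipynb
            # base_parameters passes key-value pairs to the notebook's widgets.
            # The widget name "env" and the value "dev" are illustrative.
            base_parameters:
              env: dev
# ...
```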

Python script task

You use this task to run a Python file.

The following example adds a Python script task to a job. The path for the Python file to deploy is relative to the databricks.yml file in which this task is declared. The task gets the Python file from its deployed location in the Databricks workspace. (Ellipses indicate omitted content for brevity.)

# ...
resources:
  jobs:
    my-python-script-job:
      name: my-python-script-job
      # ...
      tasks:
        - task_key: my-python-script-task
          spark_python_task:
            python_file: ./my-script.py
          # ...
# ...

For additional mappings that you can set for this task, see tasks > spark_python_task in the create job operation’s request payload as defined in POST /api/2.1/jobs/create in the REST API reference, expressed in YAML format. See also “Python script” in Task type options and Pass parameters to a Databricks job task.
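If the script expects command-line arguments, you can supply them through the spark_python_task’s parameters list; the script receives them as sys.argv entries. The argument values below are illustrative.

```yaml
# ...
      tasks:
        - task_key: my-python-script-task
          spark_python_task:
            python_file: ./my-script.py
            # parameters are passed to the script as command-line arguments
            # (available in sys.argv). These values are illustrative.
            parameters:
              - "--env"
              - "dev"
# ...
```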

Python wheel task

You use this task to run a Python wheel.

The following example adds a Python wheel task to a job. The path for the Python wheel to deploy is relative to the databricks.yml file in which this task is declared. (Ellipses indicate omitted content for brevity.)

# ...
resources:
  jobs:
    my-python-wheel-job:
      name: my-python-wheel-job
      # ...
      tasks:
        - task_key: my-python-wheel-task
          python_wheel_task:
            entry_point: run
            package_name: my_package
          libraries:
            - whl: ./my_package/dist/my_package-*.whl
          # ...
# ...

For additional mappings that you can set for this task, see tasks > python_wheel_task in the create job operation’s request payload as defined in POST /api/2.1/jobs/create in the REST API reference, expressed in YAML format. See also Develop a Python wheel by using Databricks Asset Bundles, and “Python wheel” in Task type options and Pass parameters to a Databricks job task.
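If the wheel’s entry point takes arguments, you can supply them with either the parameters list (positional) or the named_parameters mapping (key-value), but not both on the same task. The parameter name and value below are illustrative.

```yaml
# ...
      tasks:
        - task_key: my-python-wheel-task
          python_wheel_task:
            entry_point: run
            package_name: my_package
            # Use either parameters (positional) or named_parameters
            # (key-value), not both. This name and value are illustrative.
            named_parameters:
              env: dev
          libraries:
            - whl: ./my_package/dist/my_package-*.whl
# ...
```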

JAR task

You use this task to run a JAR. JARs can be referenced only from Unity Catalog volumes or external cloud storage locations.

The following example adds a JAR task to a job. The path for the JAR points to its location in the specified Unity Catalog volume. (Ellipses indicate omitted content for brevity.)

# ...
resources:
  jobs:
    my-jar-job:
      name: my-jar-job
      # ...
      tasks:
        - task_key: my-jar-task
          spark_jar_task:
            main_class_name: org.example.com.Main
          libraries:
            - jar: /Volumes/main/default/my-volume/my-project-0.1.0-SNAPSHOT.jar
          # ...
# ...

For additional mappings that you can set for this task, see tasks > spark_jar_task in the create job operation’s request payload as defined in POST /api/2.1/jobs/create in the REST API reference, expressed in YAML format. See also “JAR” in Task type options and Pass parameters to a Databricks job task.
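Arguments for the JAR’s main method go in the spark_jar_task’s parameters list. The argument values below are illustrative.

```yaml
# ...
      tasks:
        - task_key: my-jar-task
          spark_jar_task:
            main_class_name: org.example.com.Main
            # parameters are passed as arguments to the main method.
            # These values are illustrative.
            parameters:
              - "--env"
              - "dev"
          libraries:
            - jar: /Volumes/main/default/my-volume/my-project-0.1.0-SNAPSHOT.jar
# ...
```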

Delta Live Tables pipeline task

You use this task to run a Delta Live Tables pipeline. See What is Delta Live Tables?.

The following example adds a Delta Live Tables pipeline task to a job. This Delta Live Tables pipeline task runs the specified pipeline. (Ellipses indicate omitted content for brevity.)

# ...
resources:
  jobs:
    my-pipeline-job:
      name: my-pipeline-job
      # ...
      tasks:
        - task_key: my-pipeline-task
          pipeline_task:
            pipeline_id: 11111111-1111-1111-1111-111111111111
          # ...
# ...

You can find a pipeline’s ID by opening the pipeline in the workspace and copying the Pipeline ID value on the Pipeline details tab of the pipeline’s settings page.
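If the pipeline is defined as a resource in the same bundle, you can reference its ID with a bundle substitution instead of hard-coding it. This sketch assumes a pipeline resource named my_pipeline is declared elsewhere in the bundle; the resource name is illustrative.

```yaml
# ...
      tasks:
        - task_key: my-pipeline-task
          pipeline_task:
            # References a pipeline declared under resources.pipelines in
            # this bundle; "my_pipeline" is an assumed resource name.
            pipeline_id: ${resources.pipelines.my_pipeline.id}
            # Optionally trigger a full refresh of the pipeline's tables.
            full_refresh: false
# ...
```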

For additional mappings that you can set for this task, see tasks > pipeline_task in the create job operation’s request payload as defined in POST /api/2.1/jobs/create in the REST API reference, expressed in YAML format. See also “Delta Live Tables Pipeline” in Task type options.