Add tasks to jobs in Databricks Asset Bundles
This feature is in Public Preview.
This article provides examples of various types of tasks that you can add to Databricks jobs in Databricks Asset Bundles. See What are Databricks Asset Bundles?.
Notebook task
You use this task to run a notebook.
The following example adds a notebook task to a job. The path for the notebook to deploy is relative to the databricks.yml
file in which this task is declared. The task gets the notebook from its deployed location in the Databricks workspace. (Ellipses indicate omitted content, for brevity.)
# ...
resources:
  jobs:
    my-notebook-job:
      name: my-notebook-job
      # ...
      tasks:
        - task_key: my-notebook-task
          notebook_task:
            notebook_path: ./my-notebook.ipynb
      # ...
# ...
For additional mappings that you can set for this task, see tasks > notebook_task
in the create job operation’s request payload as defined in POST /api/2.1/jobs/create in the REST API reference, expressed in YAML format. See also “Notebook” in Task type options and Pass parameters to a Databricks job task.
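For example, to pass parameters to the notebook, you can add a base_parameters mapping to the notebook_task. The following is a minimal sketch; the env key and dev value are illustrative assumptions, not required names.
# A minimal sketch: base_parameters passes key-value pairs to the
# notebook's widgets. The "env" key and "dev" value are assumptions
# for this example.
tasks:
  - task_key: my-notebook-task
    notebook_task:
      notebook_path: ./my-notebook.ipynb
      base_parameters:
        env: dev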
Python script task
You use this task to run a Python file.
The following example adds a Python script task to a job. The path for the Python file to deploy is relative to the databricks.yml
file in which this task is declared. The task gets the Python file from its deployed location in the Databricks workspace. (Ellipses indicate omitted content, for brevity.)
# ...
resources:
  jobs:
    my-python-script-job:
      name: my-python-script-job
      # ...
      tasks:
        - task_key: my-python-script-task
          spark_python_task:
            python_file: ./my-script.py
      # ...
# ...
For additional mappings that you can set for this task, see tasks > spark_python_task
in the create job operation’s request payload as defined in POST /api/2.1/jobs/create in the REST API reference, expressed in YAML format. See also “Python script” in Task type options and Pass parameters to a Databricks job task.
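For example, the spark_python_task mapping accepts a parameters list whose values are passed to the script as command-line arguments (readable in the script, for instance, with sys.argv). The following is a minimal sketch; the argument values are illustrative assumptions.
# A minimal sketch: the parameters list passes command-line arguments
# to the Python script. The "--env" and "dev" values are assumptions
# for this example.
tasks:
  - task_key: my-python-script-task
    spark_python_task:
      python_file: ./my-script.py
      parameters:
        - "--env"
        - "dev"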
Python wheel task
You use this task to run a Python wheel.
The following example adds a Python wheel task to a job. The path for the Python wheel to deploy is relative to the databricks.yml
file in which this task is declared. (Ellipses indicate omitted content, for brevity.)
# ...
resources:
  jobs:
    my-python-wheel-job:
      name: my-python-wheel-job
      # ...
      tasks:
        - task_key: my-python-wheel-task
          python_wheel_task:
            entry_point: run
            package_name: my_package
          libraries:
            - whl: ./my_package/dist/my_package-*.whl
      # ...
# ...
For additional mappings that you can set for this task, see tasks > python_wheel_task
in the create job operation’s request payload as defined in POST /api/2.1/jobs/create in the REST API reference, expressed in YAML format. See also Develop a Python wheel by using Databricks Asset Bundles, and “Python wheel” in Task type options and Pass parameters to a Databricks job task.
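For example, the python_wheel_task mapping accepts a named_parameters mapping whose key-value pairs are passed to the wheel's entry point. The following is a minimal sketch, assuming the entry point parses these arguments (for instance, with argparse); the key and value are illustrative.
# A minimal sketch: named_parameters passes key-value arguments to the
# wheel's entry point. The "env" key and "dev" value are assumptions
# for this example.
tasks:
  - task_key: my-python-wheel-task
    python_wheel_task:
      entry_point: run
      package_name: my_package
      named_parameters:
        env: dev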
JAR task
You use this task to run a JAR. JARs can be referenced only from Unity Catalog volumes or external cloud storage locations.
The following example adds a JAR task to a job. The path for the JAR is the specified Unity Catalog volume location. (Ellipses indicate omitted content, for brevity.)
# ...
resources:
  jobs:
    my-jar-job:
      name: my-jar-job
      # ...
      tasks:
        - task_key: my-jar-task
          spark_jar_task:
            main_class_name: org.example.com.Main
          libraries:
            - jar: /Volumes/main/default/my-volume/my-project-0.1.0-SNAPSHOT.jar
      # ...
# ...
For additional mappings that you can set for this task, see tasks > spark_jar_task
in the create job operation’s request payload as defined in POST /api/2.1/jobs/create in the REST API reference, expressed in YAML format. See also “JAR” in Task type options and Pass parameters to a Databricks job task.
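For example, the spark_jar_task mapping accepts a parameters list whose values are passed to the main class's main method. The following is a minimal sketch; the argument values are illustrative assumptions.
# A minimal sketch: the parameters list passes arguments to the JAR's
# main method. The "--env" and "dev" values are assumptions for this
# example.
tasks:
  - task_key: my-jar-task
    spark_jar_task:
      main_class_name: org.example.com.Main
      parameters:
        - "--env"
        - "dev"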
Delta Live Tables pipeline task
You use this task to run a Delta Live Tables pipeline. See What is Delta Live Tables?.
The following example adds a Delta Live Tables pipeline task to a job. This Delta Live Tables pipeline task runs the specified pipeline. (Ellipses indicate omitted content, for brevity.)
# ...
resources:
  jobs:
    my-pipeline-job:
      name: my-pipeline-job
      # ...
      tasks:
        - task_key: my-pipeline-task
          pipeline_task:
            pipeline_id: 11111111-1111-1111-1111-111111111111
      # ...
# ...
You can find a pipeline's ID by opening the pipeline in the workspace and copying the Pipeline ID value on the Pipeline details tab of the pipeline's settings page.
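If the pipeline is declared in the same bundle, you can reference its ID with a substitution instead of hard-coding it. The following is a minimal sketch, assuming a pipeline declared elsewhere in this bundle under the resource key my_pipeline (an illustrative name).
# A minimal sketch, assuming a pipeline declared in this bundle under
# the resource key my_pipeline; the substitution resolves to that
# pipeline's ID at deployment.
tasks:
  - task_key: my-pipeline-task
    pipeline_task:
      pipeline_id: ${resources.pipelines.my_pipeline.id}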
For additional mappings that you can set for this task, see tasks > pipeline_task
in the create job operation’s request payload as defined in POST /api/2.1/jobs/create in the REST API reference, expressed in YAML format. See also “Delta Live Tables Pipeline” in Task type options.
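For example, the pipeline_task mapping accepts a full_refresh flag that triggers a full refresh of the pipeline instead of an incremental update. A minimal sketch:
# A minimal sketch: setting full_refresh triggers a full refresh of
# the pipeline on each run of this task.
tasks:
  - task_key: my-pipeline-task
    pipeline_task:
      pipeline_id: 11111111-1111-1111-1111-111111111111
      full_refresh: true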