This guide discusses the orchestration of multiple tasks using Databricks jobs, a feature that is in Public Preview. For information about how to create, run, and manage single-task jobs using the generally available jobs interface, see Jobs.
You can use a job to run a data processing or data analysis task in a Databricks cluster with scalable resources. Your job can consist of a single task or be a large, multi-task application with complex dependencies. Databricks manages the task orchestration, cluster management, monitoring, and error reporting for all of your jobs. You can run your jobs immediately or periodically through an easy-to-use scheduling system.
You can implement job tasks using notebooks, Delta Live Tables pipelines, or Python, Scala, and Java applications. For example, a single job might include a Python script task that ingests data from cloud storage, a Delta Live Tables pipeline task that prepares the data, and a notebook task that creates a dashboard.
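A multi-task job like the one described above can be expressed as a job specification in which each task declares the tasks it depends on. The following is a minimal sketch in the Jobs API 2.1 format; the job name, file paths, and pipeline ID are placeholders invented for illustration.

```python
# Hypothetical multi-task job specification. Databricks runs a task only
# after every task listed in its depends_on array has succeeded, so this
# job executes ingest -> prepare -> dashboard in order.
job_spec = {
    "name": "nightly-pipeline",  # placeholder name
    "tasks": [
        {
            "task_key": "ingest",
            "spark_python_task": {"python_file": "dbfs:/scripts/ingest.py"},
        },
        {
            "task_key": "prepare",
            "depends_on": [{"task_key": "ingest"}],
            "pipeline_task": {"pipeline_id": "<your-dlt-pipeline-id>"},
        },
        {
            "task_key": "dashboard",
            "depends_on": [{"task_key": "prepare"}],
            "notebook_task": {"notebook_path": "/Repos/analytics/dashboard"},
        },
    ],
}
```

Tasks with no `depends_on` entry (here, `ingest`) start as soon as the job run begins; tasks with the same dependencies can run in parallel.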
You create jobs through the Jobs UI, the Jobs API, or the Databricks CLI. The Jobs UI allows you to monitor, test, and troubleshoot your running and completed jobs.
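As a sketch of the API route, the snippet below builds a job-creation request against the Jobs API 2.1 endpoint (`POST /api/2.1/jobs/create`, which supports multi-task jobs) using only the standard library. The workspace URL, token, and job contents are placeholders; the request is only sent when credentials are actually configured.

```python
import json
import os
import urllib.request

# Placeholders: supply your own workspace URL and personal access token,
# e.g. via the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables.
host = os.environ.get("DATABRICKS_HOST", "https://<your-workspace>.cloud.databricks.com")
token = os.environ.get("DATABRICKS_TOKEN")

# Minimal single-task spec for illustration; a real spec can list many
# tasks with depends_on relationships between them.
job_spec = {
    "name": "example-job",
    "tasks": [
        {
            "task_key": "ingest",
            "spark_python_task": {"python_file": "dbfs:/scripts/ingest.py"},
        }
    ],
}

request = urllib.request.Request(
    f"{host}/api/2.1/jobs/create",
    data=json.dumps(job_spec).encode("utf-8"),
    headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
    method="POST",
)

# Only contact a real workspace when a token is configured.
if token:
    with urllib.request.urlopen(request) as response:
        print(json.load(response))  # the API responds with the new job's ID
```

The Databricks CLI wraps this same endpoint, so the equivalent spec can also be passed to `databricks jobs create` as a JSON document.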
An administrator must enable support for jobs with multiple tasks in the Databricks admin console.
To get started:
- Create your first job orchestrating multiple tasks: Jobs quickstart.
- Learn about the features of Databricks jobs and how to create, view, and run jobs in Creating and managing jobs with multiple tasks.
- Learn about Jobs API updates to support creating and managing jobs with multiple tasks.