Orchestrate data processing workflows on Databricks

Databricks provides a comprehensive suite of tools and integrations to support your data processing workflows.

Data processing or analysis workflows with Databricks Jobs

You can use a Databricks job to run a data processing or data analysis task in a Databricks cluster with scalable resources. Your job can consist of a single task or be a large, multi-task workflow with complex dependencies. Databricks manages the task orchestration, cluster management, monitoring, and error reporting for all of your jobs. You can run your jobs immediately or periodically through an easy-to-use scheduling system. You can implement job tasks using notebooks, JARs, Delta Live Tables pipelines, or Python, Scala, Spark submit, and Java applications.
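As an illustration, a multi-task workflow can be expressed as a job specification in the JSON shape accepted by the Jobs API 2.1: a list of tasks, the dependencies between them, and an optional cron schedule. The sketch below is illustrative only; the notebook paths, cluster settings, and cron expression are placeholder values you would replace with your own.

```python
# Illustrative job specification in the Jobs API 2.1 format.
# Notebook paths, Spark version, node type, and schedule are placeholder values.
job_spec = {
    "name": "daily-ingest-and-transform",
    "tasks": [
        {
            "task_key": "ingest",
            "notebook_task": {"notebook_path": "/Workspace/etl/ingest"},
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
        },
        {
            "task_key": "transform",
            # Run only after the ingest task completes successfully.
            "depends_on": [{"task_key": "ingest"}],
            "notebook_task": {"notebook_path": "/Workspace/etl/transform"},
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
        },
    ],
    # Quartz cron syntax: run every day at 06:00 UTC.
    "schedule": {
        "quartz_cron_expression": "0 0 6 * * ?",
        "timezone_id": "UTC",
    },
}
```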

You create jobs through the Jobs UI, the Jobs API, or the Databricks CLI. The Jobs UI allows you to monitor, test, and troubleshoot your running and completed jobs.
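As a sketch of the API route, the specification above can be submitted with a single REST call to the Jobs API 2.1. The workspace URL and access token below are placeholders read from environment variables; the same JSON could equally be built in the Jobs UI or passed to the Databricks CLI.

```python
import os
import requests

# Placeholder credentials: set DATABRICKS_HOST (e.g. https://<workspace>.cloud.databricks.com)
# and DATABRICKS_TOKEN to a personal access token for your workspace.
host = os.environ["DATABRICKS_HOST"]
token = os.environ["DATABRICKS_TOKEN"]
headers = {"Authorization": f"Bearer {token}"}

# Create the job from the multi-task specification sketched earlier in this section.
resp = requests.post(f"{host}/api/2.1/jobs/create", headers=headers, json=job_spec)
resp.raise_for_status()
job_id = resp.json()["job_id"]

# Trigger an immediate run in addition to the cron schedule.
run = requests.post(f"{host}/api/2.1/jobs/run-now", headers=headers, json={"job_id": job_id})
run.raise_for_status()
print(f"Started run {run.json()['run_id']} of job {job_id}")
```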

To get started:

  • Create your first Databricks jobs workflow with the quickstart.

  • Learn how to create, view, and run workflows with the Databricks jobs user interface.

  • Learn about Jobs API updates to support creating and managing workflows with Databricks jobs.

Transform your data with Delta Live Tables

Delta Live Tables is a framework for building reliable, maintainable, and testable data processing pipelines. You define the transformations to perform on your data, and Delta Live Tables manages task orchestration, cluster management, monitoring, data quality, and error handling. You can build your entire data processing workflow with a single Delta Live Tables pipeline, or run the pipeline as a task in a Databricks job to orchestrate a more complex workflow.
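As a minimal sketch, a Delta Live Tables pipeline can be defined in a Python notebook with the `dlt` module: each decorated function declares a table, and expectations attach data quality rules. The source path, table names, and columns below are assumed placeholders, and the code only executes when the notebook is attached to a Delta Live Tables pipeline (which provides the implicit `spark` session).

```python
import dlt
from pyspark.sql.functions import col

# Bronze: ingest raw JSON files from a placeholder cloud storage path.
@dlt.table(comment="Raw orders ingested from cloud storage.")
def orders_raw():
    return spark.read.format("json").load("/mnt/raw/orders/")

# Silver: cleaned orders; rows violating the expectation are dropped
# and reported in the pipeline's data quality metrics.
@dlt.table(comment="Orders with valid identifiers and positive amounts.")
@dlt.expect_or_drop("valid_order", "order_id IS NOT NULL AND amount > 0")
def orders_clean():
    return dlt.read("orders_raw").select(
        col("order_id"),
        col("customer_id"),
        col("amount").cast("double"),
    )
```

A pipeline like this can also be attached to a Databricks job as a pipeline task, so the same scheduling and monitoring described above apply to it.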

To get started, see the Delta Live Tables introduction.