Task 7: Perform job scheduling and orchestration

Estimated time to complete: 1.5 hours

Typically, you want to automate and streamline workflows on Databricks to increase productivity. Automation can also reduce cost, because Databricks Jobs Compute is priced lower than the All-Purpose Compute used for ad-hoc, interactive workloads.

In this task, we review two strategies for automating workflows:

  • Scheduling and managing jobs through the Databricks notebook UI and the Jobs API
  • Setting up CI/CD and workflow integrations

Create and manage jobs using the UI

Review the article Jobs to learn how to view, create, run, and pause jobs, as well as how to view job details.

Manage jobs using the Databricks Jobs REST API

Review the Jobs REST API reference to learn how to create, edit, and delete jobs programmatically using the Jobs API.
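As a rough sketch of what a Jobs API call looks like, the snippet below builds a request body for the job-creation endpoint and posts it with only the standard library. The workspace host, token, notebook path, runtime version, and node type are all placeholders you would replace with values from your own workspace.

```python
import json
import urllib.request

def build_create_job_payload(job_name, notebook_path):
    """Build the JSON body for POST /api/2.1/jobs/create (single notebook task)."""
    return {
        "name": job_name,
        "tasks": [
            {
                "task_key": "main",
                "notebook_task": {"notebook_path": notebook_path},
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",  # example runtime version
                    "node_type_id": "i3.xlarge",          # example node type
                    "num_workers": 2,
                },
            }
        ],
    }

def create_job(host, token, payload):
    """POST the payload to the workspace; returns the new job_id on success."""
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # personal access token
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["job_id"]

# Build the request body (sending it requires a real workspace URL and token).
payload = build_create_job_payload("nightly-etl", "/Users/someone@example.com/etl")
```

Editing and deleting jobs follow the same pattern against the `jobs/update`, `jobs/reset`, and `jobs/delete` endpoints, each taking the `job_id` returned at creation.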

Advanced job options

Sometimes you might need to set special configuration options for your jobs, such as capping the number of runs that can execute concurrently, configuring email alerts, or setting timeout and retry policies. To learn how, review the article Jobs.
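The advanced options above can also be expressed directly as fields in a Jobs API job-settings payload. The sketch below shows where each option lives; the job name, email address, notebook path, and numeric values are illustrative placeholders.

```python
# Illustrative job settings with advanced options (Jobs API 2.1 field names):
# concurrency and timeout are job-level, retry policy is task-level.
advanced_settings = {
    "name": "nightly-etl",
    "max_concurrent_runs": 1,    # at most one run of this job at a time
    "timeout_seconds": 3600,     # fail any run that exceeds one hour
    "email_notifications": {
        "on_failure": ["oncall@example.com"],  # alert recipients on failure
    },
    "tasks": [
        {
            "task_key": "main",
            "max_retries": 2,                    # retry a failed task twice
            "min_retry_interval_millis": 60000,  # wait one minute between retries
            "retry_on_timeout": False,           # do not retry timed-out runs
            "notebook_task": {"notebook_path": "/Users/someone@example.com/etl"},
        }
    ],
}
```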

Run jobs using CI/CD and workflow tools

For information about running jobs using CI/CD and workflow tools, see Continuous integration and delivery on Databricks using Jenkins and Apache Airflow.
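Whatever the orchestrator, an external pipeline typically triggers an already-defined job through the Jobs API's run-now endpoint. The helper below is a minimal sketch of building that request body; the job ID and parameter names are placeholders, and in practice a tool such as Airflow wraps this call in its own operator.

```python
def build_run_now_payload(job_id, notebook_params=None):
    """Build the JSON body for POST /api/2.1/jobs/run-now.

    notebook_params, if given, is a dict of parameters passed to the
    job's notebook task for this run.
    """
    payload = {"job_id": job_id}
    if notebook_params:
        payload["notebook_params"] = notebook_params
    return payload

# Example: trigger job 1234 with a run date parameter (values are placeholders).
payload = build_run_now_payload(1234, {"run_date": "2024-01-01"})
```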

Complete onboarding