Configure classic compute for Lakeflow Jobs
Classic jobs require you to create and configure classic compute resources that fit the needs of your data transformation scenarios.
Databricks recommends serverless compute for most job workloads. Serverless compute manages all infrastructure and eliminates the need for specific compute configuration. See Run your Lakeflow Jobs with serverless compute for workflows.
If your workload is not supported on serverless, use the general classic compute best practices described in Classic compute configuration best practices, and review the jobs-specific guidance on this page.
Structured Streaming workflows have specific configuration recommendations. See Production considerations for Structured Streaming.
Use jobs compute, not all-purpose compute
Databricks recommends against using all-purpose compute for jobs for the following reasons:
- Databricks bills for all-purpose compute at a different, typically higher, rate than jobs compute.
- Jobs compute terminates automatically after a job run is complete. All-purpose compute supports auto-termination, but it is tied to inactivity rather than the end of a job run, so the compute can keep running after the job finishes.
- All-purpose compute is often shared across teams of users. Jobs scheduled against all-purpose compute often have increased latency due to competition for compute resources.
- Many recommendations for optimizing jobs compute configuration are not appropriate for the type of ad-hoc queries and interactive workloads run on all-purpose compute.
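As a rough illustration of the difference, the following sketch builds a job definition in the shape used by the Jobs API, where a task references a `job_clusters` entry (jobs compute) instead of an `existing_cluster_id` (all-purpose compute). The job name, notebook path, runtime version, and node type are placeholder assumptions, and `tasks_on_all_purpose` is a hypothetical local helper, not part of any Databricks library.

```python
import json

# Placeholder values only: the job name, notebook path, runtime version,
# and node type are assumptions for illustration, not recommendations.
job_spec = {
    "name": "nightly-transform",
    "job_clusters": [
        {
            "job_cluster_key": "transform_cluster",
            # new_cluster describes jobs compute created for the run
            # and terminated when the run completes.
            "new_cluster": {
                "spark_version": "15.4.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
        }
    ],
    "tasks": [
        {
            "task_key": "transform",
            "job_cluster_key": "transform_cluster",
            "notebook_task": {"notebook_path": "/Workspace/etl/transform"},
        }
    ],
}


def tasks_on_all_purpose(spec):
    """Hypothetical helper: list task keys pinned to all-purpose compute."""
    return [
        task["task_key"]
        for task in spec.get("tasks", [])
        if "existing_cluster_id" in task
    ]


print(json.dumps(job_spec, indent=2))
print(tasks_on_all_purpose(job_spec))
```

A check like `tasks_on_all_purpose` could be run in review or CI to flag job definitions that still point at all-purpose compute.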
Limited exceptions
The following are use cases in which you might choose to use all-purpose compute for jobs:
- You are iteratively developing or testing new jobs. Start-up times for jobs compute can make iterative development tedious. All-purpose compute allows you to apply changes and run your job quickly.
- You have short-lived jobs that must be run frequently or on a specific schedule. All-purpose compute that is already running has no start-up time. If you use this pattern, consider the cost of the time the compute sits idle between runs.
Serverless compute for jobs is the recommended substitute for most task types you might consider running against all-purpose compute.
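To make the idle-time consideration for short-lived, frequent jobs concrete, the following sketch estimates how much of a day always-on all-purpose compute would sit idle. The schedule (96 runs of 3 minutes each) is an invented example, `idle_minutes` is a hypothetical helper, and the calculation assumes runs never overlap.

```python
MINUTES_PER_DAY = 24 * 60


def idle_minutes(runs_per_day: int, run_minutes: int,
                 up_minutes: int = MINUTES_PER_DAY) -> int:
    """Minutes the compute is up but not running a job (non-overlapping runs)."""
    busy = runs_per_day * run_minutes
    if busy > up_minutes:
        raise ValueError("runs do not fit in the uptime window")
    return up_minutes - busy


# Invented schedule: one 3-minute run every 15 minutes, compute up all day.
idle = idle_minutes(runs_per_day=96, run_minutes=3)
print(idle)                    # 1152
print(idle / MINUTES_PER_DAY)  # 0.8
```

In this invented schedule the compute is idle for 80% of the day, and that idle time is still billed while the compute is running.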
Jobs-specific compute policies
Databricks recommends that workspace admins define compute policies for jobs and enforce these policies for all users who configure jobs.
Databricks provides a default policy configured for jobs. Admins can make this policy available to other workspace users. See Job Compute.
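For admins who define their own policy, a definition might look like the following minimal sketch. It is an illustrative custom policy, not the contents of the default Job Compute policy. The worker limit and the tag name and value are assumptions. The sketch serializes the definition because the policy definition is supplied as a JSON document.

```python
import json

# Illustrative custom policy; the limits and tag below are assumptions.
policy_definition = {
    # Restrict the policy to jobs compute.
    "cluster_type": {"type": "fixed", "value": "job"},
    # Cap cluster size and suggest a small default.
    "num_workers": {"type": "range", "maxValue": 8, "defaultValue": 2},
    # Leave the runtime open but default to the latest LTS version.
    "spark_version": {"type": "unlimited", "defaultValue": "auto:latest-lts"},
    # Force a tag so usage can be attributed (hypothetical tag).
    "custom_tags.team": {"type": "fixed", "value": "data-eng"},
}

# Serialize to the JSON string form used when creating or editing a policy.
definition_json = json.dumps(policy_definition, indent=2)
print(definition_json)
```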