Databricks cluster compute types: feature comparison

Databricks offers five compute types, each designed for a different type of workload:

  • Jobs compute: Run data engineering and ML workloads on scalable and highly reliable compute. Strongly recommended for production workloads.

  • All-purpose compute: Designed for interactive data science and analysis workloads.

  • SQL Pro: Run BI and analytics workloads on compute designed for high concurrency and low latency.

  • Serverless SQL: All the capabilities of SQL Pro, plus instant startup and autoscaling through an optimized compute fleet managed for you by Databricks.

  • SQL Classic: Run SQL queries for BI reporting, analytics, and visualization to get timely insights from data lakes.

Preview: The SQL Pro compute type is in Public Preview.
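BI tools typically reach the SQL compute types above over JDBC/ODBC. As a minimal sketch, the helper below assembles a JDBC URL in the format used by the Databricks JDBC driver with personal-access-token authentication; the hostname, HTTP path, and token here are placeholders, not real connection details — copy the actual values from the warehouse's Connection details tab in your workspace.

```python
# Sketch: build a JDBC URL for a Databricks SQL warehouse (Pro, Serverless,
# or Classic). All connection values below are placeholders.

def sql_warehouse_jdbc_url(server_hostname: str, http_path: str) -> str:
    """Assemble a Databricks JDBC URL using personal access token
    authentication (AuthMech=3, UID=token)."""
    return (
        f"jdbc:databricks://{server_hostname}:443;"
        f"transportMode=http;ssl=1;"
        f"httpPath={http_path};"
        f"AuthMech=3;UID=token;PWD=<personal-access-token>"
    )

url = sql_warehouse_jdbc_url(
    "dbc-example.cloud.databricks.com",      # placeholder workspace hostname
    "/sql/1.0/warehouses/1234567890abcdef",  # placeholder warehouse HTTP path
)
print(url)
```

The same hostname and HTTP path also work with ODBC DSNs and with the `databricks-sql-connector` Python package.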

This table shows the features available with each compute type.

| Feature | Jobs compute | All-purpose compute | SQL Pro/Serverless SQL | SQL Classic |
| --- | --- | --- | --- | --- |
| Managed Apache Spark: Apache Spark clusters for running production jobs on the Databricks platform, with alerting and retries. | X | X | | |
| Job scheduling: Production jobs including streaming with monitoring, multi-step jobs, SQL, and a scheduler for running libraries. | X | X | X | |
| Autopilot clusters: Cost-effective clusters with autoscaling of compute and instance storage, automatic start and termination of clusters, use of spot instances, and Photon. | X | X | | |
| Databricks Runtime for ML: Out-of-the-box ML frameworks, including Spark/Horovod integration, XGBoost, TensorFlow, PyTorch, and Keras support, experiment tracking, hyperparameter optimization, glassbox ML with AutoML, feature engineering, Databricks Feature Store, and MLflow Model Registry. | X | X | | |
| Managed MLflow: Run MLflow on the Databricks platform to simplify the end-to-end ML lifecycle with MLflow remote execution and a managed tracking server. You can also run MLflow from outside of Databricks (usage may be subject to a limit). | X | X | | |
| Delta Lake with Delta Engine: Robust pipelines serving clean, quality data, supporting high-performance analytics at scale. Delta Lake on Databricks provides ACID transactions, schema management, batch and stream read and write support, and data versioning, along with Delta Engine's performance optimizations. | X | X | X | X |
| Interactive clusters: High-concurrency mode for multiple users and persistent clusters for analytics. | | X | | |
| Notebooks and collaboration: Enable highly collaborative and productive work among analysts and other colleagues using Scala, Python, SQL, and R notebooks that provide one-click visualization, interactive dashboards, data profiles, no-code data exploration, parameter widgets, experiment tracking, revision history, and version control integration using Git providers such as GitHub. | X | X | | |
| Ecosystem integrations: RStudio integration and a range of third-party BI tools through JDBC/ODBC. | | X | | |
| Business intelligence: High-performance, scalable, fully managed Photon execution engine for SQL queries through SQL warehouses. Includes the option to optimize for cost or reliability, built-in query editor, query history, query profiles, reliable data caching, auto-termination, built-in dashboards, and alerts. | | | X | X |
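The job scheduling, retry, and autoscaling features in the table map onto the Databricks Jobs API. Below is a minimal sketch of a Jobs API 2.1 job-creation payload; the job name, notebook path, runtime version, node type, and cluster sizes are all illustrative placeholders, not values prescribed by this document.

```python
import json

# Sketch: a Jobs API 2.1 job-creation payload combining a cron schedule,
# task retries, and an autoscaling cluster. All names and sizes below are
# hypothetical examples.

payload = {
    "name": "nightly-etl",  # placeholder job name
    "schedule": {
        "quartz_cron_expression": "0 0 2 * * ?",  # daily at 02:00
        "timezone_id": "UTC",
        "pause_status": "UNPAUSED",
    },
    "tasks": [
        {
            "task_key": "ingest",
            "notebook_task": {"notebook_path": "/Repos/etl/ingest"},  # placeholder
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",  # example runtime label
                "node_type_id": "i3.xlarge",          # example AWS node type
                "autoscale": {"min_workers": 2, "max_workers": 8},
            },
            "max_retries": 2,  # automatic retries, per the jobs row above
        }
    ],
}

body = json.dumps(payload, indent=2)
print(body)
```

This payload would be POSTed to the workspace's `/api/2.1/jobs/create` endpoint; running it on jobs compute rather than an all-purpose cluster is the pattern the table recommends for production workloads.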

For information on pricing by compute type, see AWS Pricing.