Databricks offers three “compute” types, each designed for a different type of workload:
- Jobs Light compute: Run Databricks jobs on Jobs Light clusters with the open source Spark runtime on the Databricks platform.
- Jobs compute: Run Databricks jobs on Jobs clusters with Databricks’ optimized runtime for substantial performance and scalability improvements.
- All-purpose compute: Run any workload on All-purpose clusters, including interactive data science and analysis, BI workloads via JDBC/ODBC, MLflow experiments, Databricks jobs, and so on.
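As a rough illustration of how the choice between Jobs compute and All-purpose compute might surface when a job is defined, the sketch below builds a request body in the style of the Databricks Jobs REST API (2.0, single-task form). The field names `new_cluster`, `existing_cluster_id`, and `notebook_task` are taken from that API; the helper `build_job_payload`, the runtime version, the node type, and the cluster ID are hypothetical placeholders, and the mapping of each field to a compute type is an assumption based on the descriptions above rather than something this page states.

```python
import json


def build_job_payload(name, notebook_path, existing_cluster_id=None):
    """Build a Jobs API 2.0-style request body (hypothetical helper).

    Assumption: a job that defines its own `new_cluster` runs on a Jobs
    cluster, while a job that points at `existing_cluster_id` runs on an
    already-running All-purpose cluster.
    """
    payload = {
        "name": name,
        "notebook_task": {"notebook_path": notebook_path},
    }
    if existing_cluster_id is not None:
        # Reuse an All-purpose cluster that someone has already created.
        payload["existing_cluster_id"] = existing_cluster_id
    else:
        # Ask for a dedicated cluster that exists only for this job run.
        payload["new_cluster"] = {
            "spark_version": "7.3.x-scala2.12",  # placeholder runtime version
            "node_type_id": "i3.xlarge",         # placeholder instance type
            "num_workers": 2,
        }
    return payload


if __name__ == "__main__":
    # Scheduled production job on its own Jobs cluster.
    print(json.dumps(build_job_payload("nightly-etl", "/Shared/etl"), indent=2))
    # Same notebook run against an existing All-purpose cluster.
    print(json.dumps(
        build_job_payload("adhoc-etl", "/Shared/etl",
                          existing_cluster_id="0000-000000-example"),
        indent=2,
    ))
```

The two payloads differ only in which cluster field they carry, which keeps the job definition itself independent of the compute type it ends up running on.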
This table shows the features available with each compute type.
|Feature||Jobs Light compute||Jobs compute||All-purpose compute|
|Managed Apache Spark||X||X||X|
|Apache Spark clusters for running production jobs on the Databricks platform, with alerting and retries.|
|Job scheduling with libraries||X||X||X|
|Run production jobs, including streaming jobs, with monitoring and a scheduler for running libraries.|
|Job scheduling with notebooks||X||X|
|Ability to schedule jobs using Scala, Python, R, and SQL notebooks and notebook workflows.|
|Easy-to-manage, cost-effective clusters, with autoscaling of compute and instance storage and automatic start and termination of clusters.|
|Databricks Runtime for ML||X||X|
|Out-of-the-box ML frameworks, including Spark/Horovod integration; XGBoost, TensorFlow, PyTorch, and Keras support.|
|Run MLflow on the Databricks platform to simplify the end-to-end ML lifecycle, with MLflow remote execution and a managed tracking server. You can even run MLflow from outside of Databricks (usage may be subject to a limit).|
|Delta Lake with Delta Engine||X||X|
|Robust pipelines that serve clean, high-quality data and support high-performance batch and streaming analytics at scale. Delta Lake on Databricks provides ACID transactions, schema management, batch/stream read/write support, and data versioning, along with Delta Engine’s performance optimizations.|
|High-concurrency mode for multiple users and persistent clusters for analytics.|
|Notebooks and collaboration||X|
|Enable highly collaborative and productive work among analysts and their colleagues using Scala, Python, SQL, and R notebooks that provide one-click visualization, interactive dashboards, revision history, and version control integration (GitHub, Bitbucket).|
|Integration with RStudio® and a range of third-party BI tools through JDBC/ODBC.|
For information on pricing by compute type, see AWS Pricing.