Databricks Runtimes

Databricks runtimes are the set of core components that run on Databricks clusters. Databricks offers several types of runtimes:

Databricks Runtime

Includes Apache Spark and adds a number of components and updates that substantially improve the usability, performance, and security of big data analytics.

Databricks Runtime with Conda

An experimental version of Databricks Runtime based on Conda. Databricks Runtime with Conda provides an updated and optimized list of default packages and a flexible Python environment for advanced users who require maximum control over packages and environments.

Databricks Runtime for Machine Learning

Built on Databricks Runtime, this runtime provides a ready-to-go environment for machine learning and data science. It contains multiple popular libraries, including TensorFlow, Keras, PyTorch, and XGBoost.

Databricks Runtime for Genomics

A version of Databricks Runtime optimized for working with genomic and biomedical data.

Databricks Light

The Databricks packaging of the open source Apache Spark runtime. It provides a runtime option for jobs that don’t need the advanced performance, reliability, or autoscaling benefits provided by Databricks Runtime. You can select Databricks Light only when you create a cluster to run a JAR, Python, or spark-submit job; you cannot select this runtime for clusters on which you run interactive or notebook job workloads.
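The Databricks Light restriction described above can be expressed as a small check. This is a minimal sketch, not Databricks code: the helper name `light_allowed` is hypothetical, and the task-type keys are assumed to follow the Jobs API field names (`spark_jar_task`, `spark_python_task`, `spark_submit_task`, `notebook_task`).

```python
# Hypothetical helper illustrating the Databricks Light restriction.
# Assumption: workloads are identified by Jobs API-style task keys.
LIGHT_COMPATIBLE_TASKS = {
    "spark_jar_task",
    "spark_python_task",
    "spark_submit_task",
}


def light_allowed(task_type=None):
    """Return True if the workload may run on Databricks Light.

    task_type is None for an interactive cluster (no job attached).
    """
    return task_type in LIGHT_COMPATIBLE_TASKS


if __name__ == "__main__":
    for task in ("spark_jar_task", "notebook_task", None):
        print(task, "->", light_allowed(task))
```

Under these assumptions, JAR, Python, and spark-submit jobs pass the check, while notebook jobs and interactive clusters do not.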

You can choose from among many supported runtime versions when you create a cluster.

[Image: Select Databricks runtime]
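Outside the UI, the runtime version is typically passed as a string when a cluster is defined. The sketch below assumes the Clusters REST API, where `GET /api/2.0/clusters/spark-versions` returns a `versions` list of `key`/`name` pairs and `POST /api/2.0/clusters/create` accepts a `spark_version` field. The helper names, the sample version entries, and the node type are hypothetical placeholders, and no request is actually sent.

```python
def find_runtime_keys(versions, name_fragment):
    """Return keys of runtime versions whose display name contains name_fragment."""
    fragment = name_fragment.lower()
    return [v["key"] for v in versions if fragment in v["name"].lower()]


def build_cluster_spec(cluster_name, spark_version, node_type_id, num_workers):
    """Build a request body in the shape assumed for clusters/create."""
    return {
        "cluster_name": cluster_name,
        "spark_version": spark_version,  # the selected runtime version key
        "node_type_id": node_type_id,
        "num_workers": num_workers,
    }


# Hypothetical sample of what the spark-versions endpoint might return.
SAMPLE_VERSIONS = [
    {"key": "example-runtime-1", "name": "Example Runtime 1 (Standard)"},
    {"key": "example-runtime-1-ml", "name": "Example Runtime 1 ML"},
    {"key": "example-light-1", "name": "Example Light 1"},
]

if __name__ == "__main__":
    ml_keys = find_runtime_keys(SAMPLE_VERSIONS, "ml")
    spec = build_cluster_spec("demo-cluster", ml_keys[0], "example-node-type", 2)
    print(spec)
```

In a real workspace, the list of versions would come from the workspace itself rather than from a hard-coded sample, and the resulting body would be sent with an authenticated request.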

For details on each runtime type, see the documentation for that runtime type.

For information about the contents of each runtime version, see the release notes.