Databricks runtimes are the set of core components that run on Databricks clusters. Databricks offers several types of runtimes:
- Databricks Runtime
  - Includes Apache Spark but also adds a number of components and updates that substantially improve the usability, performance, and security of big data analytics.
- Databricks Runtime with Conda
  - An experimental version of Databricks Runtime based on Conda. Databricks Runtime with Conda provides an updated and optimized list of default packages and a flexible Python environment for advanced users who require maximum control over packages and environments.
- Databricks Runtime for Machine Learning
  - Built on Databricks Runtime, it provides a ready-to-go environment for machine learning and data science. It includes popular libraries such as TensorFlow, Keras, PyTorch, and XGBoost.
- Databricks Runtime for Health and Life Sciences
  - A version of Databricks Runtime optimized for working with genomic and biomedical data.
- Databricks Light
  - The Databricks packaging of the open source Apache Spark runtime. It provides a runtime option for jobs that don’t need the advanced performance, reliability, or autoscaling benefits provided by Databricks Runtime. You can select Databricks Light only when you create a cluster to run a JAR, Python, or spark-submit job; you cannot select this runtime for clusters on which you run interactive or notebook job workloads.
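The Databricks Light restriction described above can be read as a simple rule over workload types. The following is a minimal sketch of that rule only; the workload labels and the function name `can_use_databricks_light` are hypothetical names chosen for illustration and are not part of any Databricks API.

```python
# Hypothetical workload labels for illustration; these are not Databricks API values.
LIGHT_COMPATIBLE_WORKLOADS = {"jar_job", "python_job", "spark_submit_job"}
OTHER_WORKLOADS = {"interactive", "notebook_job"}


def can_use_databricks_light(workload: str) -> bool:
    """Return True if the workload type may run on Databricks Light.

    Encodes the rule stated in the text: Light is available only for
    JAR, Python, and spark-submit jobs, not for interactive or
    notebook job workloads.
    """
    if workload not in LIGHT_COMPATIBLE_WORKLOADS | OTHER_WORKLOADS:
        raise ValueError(f"unknown workload type: {workload!r}")
    return workload in LIGHT_COMPATIBLE_WORKLOADS


for workload in ("jar_job", "notebook_job"):
    print(workload, can_use_databricks_light(workload))
```

A check like this might be useful in deployment tooling that validates a job definition before a cluster is requested.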
When you create a cluster, you can choose from many supported runtime versions.
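If you create clusters programmatically rather than through the UI, the runtime version is typically passed as the `spark_version` field of the cluster specification sent to the Clusters API. The sketch below only builds such a specification and prints it as JSON; it does not call any endpoint. The helper name `build_cluster_spec` is hypothetical, and the runtime version string, node type, and cluster name are placeholder values that you would replace with ones valid for your workspace.

```python
import json


def build_cluster_spec(cluster_name: str, runtime_version: str,
                       node_type_id: str, num_workers: int) -> dict:
    """Build a cluster specification dictionary (hypothetical helper).

    The runtime is selected through the `spark_version` field.
    """
    if num_workers < 0:
        raise ValueError("num_workers must be non-negative")
    return {
        "cluster_name": cluster_name,
        "spark_version": runtime_version,  # the chosen runtime version
        "node_type_id": node_type_id,
        "num_workers": num_workers,
    }


# Placeholder values; check which runtime versions and node types
# your workspace actually supports.
spec = build_cluster_spec(
    cluster_name="example-cluster",
    runtime_version="7.3.x-scala2.12",
    node_type_id="i3.xlarge",
    num_workers=2,
)
print(json.dumps(spec, indent=2))
```

Keeping the runtime version as an explicit parameter makes it easy to pin a version per job and to change it in one place when you upgrade.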
For details on each runtime type, see:
For information about the contents of each runtime version, see the release notes.