Databricks Runtime for Machine Learning

This article describes Databricks Runtime for Machine Learning and explains how to create a cluster that uses it.

What is Databricks Runtime for Machine Learning?

Databricks Runtime for Machine Learning (Databricks Runtime ML) automates the creation of a cluster with pre-built machine learning (ML) and deep learning (DL) infrastructure, including the most common ML and DL libraries.

Libraries included in Databricks Runtime ML

Databricks Runtime ML includes a variety of popular ML libraries. The libraries are updated with each release to include new features and fixes.

Databricks has designated a subset of the supported libraries as top-tier libraries. For these libraries, Databricks provides a faster update cadence, updating to the latest package releases with each runtime release (barring dependency conflicts). Databricks also provides advanced support, testing, and embedded optimizations for top-tier libraries. Top-tier libraries are added or removed only with major releases.

For a full list of top-tier and other provided libraries, see the release notes for Databricks Runtime ML.

You can install additional libraries to create a custom environment for your notebook or cluster.
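Before installing anything extra, it can help to check which library versions the runtime already provides. The following is a minimal sketch that uses only the Python standard library; the package names are illustrative assumptions, not a list taken from the release notes.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical selection of libraries to check; adjust to your own needs.
for name in ("mlflow", "torch", "xgboost"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If a package is missing or older than you need, a notebook-scoped install such as `%pip install <package>` is one way to add it for the current notebook.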

Set up compute resources for Databricks Runtime ML

The process for creating compute based on Databricks Runtime ML depends on whether your workspace is enabled for the Dedicated group cluster Public Preview. Workspaces that are enabled for the preview have a new, simplified compute UI.

Create a cluster using Databricks Runtime ML

When you create a cluster, select a Databricks Runtime ML version from the Databricks runtime version drop-down menu. Both CPU and GPU-enabled ML runtimes are available.

[Image: Select Databricks Runtime ML]

If you select a cluster from the drop-down menu in the notebook, the Databricks Runtime version appears at the right of the cluster name:

[Image: View Databricks Runtime ML version]

If you select a GPU-enabled ML runtime, you are prompted to select a compatible Driver type and Worker type. Incompatible instance types are grayed out in the drop-down menu. GPU-enabled instance types are listed under the GPU accelerated label. For information about creating Databricks GPU clusters, see GPU-enabled compute. Databricks Runtime ML includes GPU hardware drivers and NVIDIA libraries such as CUDA.
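The same choice between CPU and GPU ML runtimes can be expressed when a cluster is defined programmatically instead of through the UI. The sketch below only builds the request payload as a Python dictionary and does not call any service. The field names follow the Databricks Clusters API, but the version-key format, the Scala suffix, and the instance types (`i3.xlarge`, `g4dn.xlarge`) are assumptions for illustration; check the runtime versions and node types that your workspace actually offers.

```python
def ml_runtime_key(version, gpu=False, scala="2.12"):
    """Build a runtime version key such as '15.4.x-cpu-ml-scala2.12'.

    The key format is an assumption; confirm it against the versions
    listed in your workspace.
    """
    flavor = "gpu" if gpu else "cpu"
    return f"{version}.x-{flavor}-ml-scala{scala}"


def ml_cluster_spec(name, version, node_type, num_workers=2, gpu=False):
    """Return a cluster definition that selects a Databricks Runtime ML version."""
    return {
        "cluster_name": name,
        "spark_version": ml_runtime_key(version, gpu=gpu),
        # A GPU runtime needs GPU-enabled driver and worker instance types.
        "node_type_id": node_type,
        "driver_node_type_id": node_type,
        "num_workers": num_workers,
    }


# Placeholder names and instance types, for illustration only.
cpu_spec = ml_cluster_spec("ml-cpu-demo", "15.4", "i3.xlarge")
gpu_spec = ml_cluster_spec("ml-gpu-demo", "15.4", "g4dn.xlarge", gpu=True)

print(cpu_spec["spark_version"])  # 15.4.x-cpu-ml-scala2.12
print(gpu_spec["spark_version"])  # 15.4.x-gpu-ml-scala2.12
```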

Create a new cluster with the new simplified compute UI

Use the steps in this section only if your workspace is enabled for the Dedicated group cluster preview.

To use the machine learning version of Databricks Runtime, select the Machine learning checkbox.

[Image: MLR selection of compute UI]

For GPU-based compute, select a GPU-enabled instance type. For the complete list of supported GPU types, see Supported instance types.

Photon and Databricks Runtime ML

When you create a CPU cluster running Databricks Runtime 15.2 ML or above, you can choose to enable Photon. Photon improves performance for applications using Spark SQL, Spark DataFrames, feature engineering, GraphFrames, and xgboost4j. It is not expected to improve performance for applications using Spark RDDs, pandas UDFs, or non-JVM languages such as Python. As a result, Python packages such as XGBoost, PyTorch, and TensorFlow will not see an improvement with Photon.
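The eligibility rule above (CPU cluster, Databricks Runtime 15.2 ML or above) can be sketched as a small check applied to a cluster definition. The helper names here are hypothetical. `runtime_engine` with the values `PHOTON` and `STANDARD` is a Clusters API field, but the example spec is an illustrative assumption, and the sketch does not contact a workspace.

```python
def parse_runtime_version(version):
    """Turn '15.4' or '15.4 LTS' into a comparable (major, minor) tuple."""
    major, minor = version.split()[0].split(".")[:2]
    return int(major), int(minor)


def can_enable_photon(version, gpu):
    """Photon is offered for CPU clusters on Databricks Runtime 15.2 ML or above."""
    return (not gpu) and parse_runtime_version(version) >= (15, 2)


def with_photon(spec, version, gpu=False):
    """Return a copy of `spec` with the runtime engine chosen by the rule above."""
    updated = dict(spec)
    updated["runtime_engine"] = "PHOTON" if can_enable_photon(version, gpu) else "STANDARD"
    return updated


# Illustrative spec; the version key and node type are placeholders.
base = {"spark_version": "15.4.x-cpu-ml-scala2.12", "node_type_id": "i3.xlarge"}

print(with_photon(base, "15.4 LTS")["runtime_engine"])       # PHOTON
print(with_photon(base, "14.3 LTS")["runtime_engine"])       # STANDARD
print(with_photon(base, "15.4 LTS", gpu=True)["runtime_engine"])  # STANDARD
```

Enabling Photon this way only helps the workloads listed above; Python-based training code on the same cluster is not expected to run faster.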

Spark RDD APIs and Spark MLlib have limited compatibility with Photon. When processing large datasets using Spark RDD or Spark MLlib, you may experience Spark memory issues. See Spark memory issues.

Databricks Runtime ML on AWS Graviton instances

Databricks Runtime 15.4 LTS ML and above support Graviton instance types. Using Graviton instance types can improve performance for Spark, Photon, feature engineering, machine learning libraries such as XGBoost and LightGBM, and Spark MLlib algorithms for gradient boosting. Graviton instances may also provide better price-to-performance value than other AWS EC2 instance types.

Access mode for Databricks Runtime ML clusters

To access data in Unity Catalog on a cluster running Databricks Runtime ML, you must do one of the following:

  • Set up the cluster using Single user access mode.

  • Set up the cluster using Dedicated access mode. Dedicated access mode, currently in Public Preview, provides the features of Shared access mode on Databricks Runtime ML.

When a compute resource has Dedicated access, the resource can be assigned to a single user or a group. When it is assigned to a group (a group cluster), the user’s permissions are automatically down-scoped to the group’s permissions, allowing the user to securely share the resource with other members of the group.
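For clusters defined programmatically, the Single user option corresponds to fields in the cluster definition. The sketch below shows one way this might look, assuming the Clusters API fields `data_security_mode` and `single_user_name`. The helper name, the example spec, and the user name are hypothetical, and the sketch deliberately does not cover the Dedicated group assignment, whose API representation is not described here.

```python
def with_single_user_access(spec, user_name):
    """Return a copy of `spec` configured for Single user access mode."""
    updated = dict(spec)
    updated["data_security_mode"] = "SINGLE_USER"
    # The user who is allowed to attach to and run commands on the cluster.
    updated["single_user_name"] = user_name
    return updated


# Illustrative spec and placeholder user name.
base_spec = {"cluster_name": "ml-uc-demo", "spark_version": "15.4.x-cpu-ml-scala2.12"}
uc_spec = with_single_user_access(base_spec, "someone@example.com")

print(uc_spec["data_security_mode"])  # SINGLE_USER
```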

When using Single user access mode, the following features are only available on Databricks Runtime 15.4 LTS ML and above: