Databricks Runtime 11.1 for Machine Learning

Databricks Runtime 11.1 for Machine Learning provides a ready-to-go environment for machine learning and data science based on Databricks Runtime 11.1. Databricks Runtime ML contains many popular machine learning libraries, including TensorFlow, PyTorch, and XGBoost. Databricks Runtime ML includes AutoML, a tool to automatically train machine learning pipelines. Databricks Runtime ML also supports distributed deep learning training using Horovod.

For more information, including instructions for creating a Databricks Runtime ML cluster, see Databricks Runtime for Machine Learning.

New features and improvements

Databricks Runtime 11.1 ML is built on top of Databricks Runtime 11.1. For information on what’s new in Databricks Runtime 11.1, including Apache Spark MLlib and SparkR, see the Databricks Runtime 11.1 release notes.

Enhancements to Databricks AutoML

The following enhancements have been made to Databricks AutoML. For details, see Databricks AutoML.

  • When AutoML detects that a classification problem is binary, it calculates binary classification metrics and infers the positive class of the problem. You can also specify the positive class using a new pos_label parameter.

  • For forecasting problems, AutoML can now handle the scenario where the horizon is long relative to the time span of the training data.
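
As an illustration of why the choice of positive class matters for binary classification metrics, the following minimal sketch computes precision, recall, and F1 for the same predictions under two different positive classes. It is plain Python and is not AutoML's implementation; the function name binary_metrics, the label values, and the sample data are hypothetical.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, pos_label):
    """Return precision, recall, and F1 computed with respect to pos_label."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == pos_label and truth == pos_label:
            counts["tp"] += 1  # predicted positive, actually positive
        elif pred == pos_label:
            counts["fp"] += 1  # predicted positive, actually negative
        elif truth == pos_label:
            counts["fn"] += 1  # predicted negative, actually positive
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: the same predictions scored against each class in turn.
y_true = ["churn", "stay", "churn", "stay", "stay"]
y_pred = ["churn", "churn", "stay", "stay", "stay"]

churn_metrics = binary_metrics(y_true, y_pred, pos_label="churn")
stay_metrics = binary_metrics(y_true, y_pred, pos_label="stay")
```

The two result dictionaries differ even though the predictions are identical, which is why it can be worth setting the positive class explicitly rather than relying on inference.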

Enhancements to Databricks Feature Store

The following enhancements have been made to Databricks Feature Store.

System environment

The system environment in Databricks Runtime 11.1 ML differs from Databricks Runtime 11.1 as follows:

Libraries

The following sections list the libraries included in Databricks Runtime 11.1 ML that differ from those included in Databricks Runtime 11.1.

Python libraries

Databricks Runtime 11.1 ML uses Virtualenv for Python package management and includes many popular ML packages.

In addition to the packages specified in the following sections, Databricks Runtime 11.1 ML also includes the following packages:

  • hyperopt 0.2.7.db1

  • sparkdl 2.2.0-db6

  • feature_store 0.5.0

  • automl 1.11.0

Python libraries on CPU clusters

| Library | Version | Library | Version | Library | Version |
| --- | --- | --- | --- | --- | --- |
| absl-py | 1.0.0 | Antergos Linux | 2015.10 (ISO-Rolling) | argon2-cffi | 20.1.0 |
| astor | 0.8.1 | astunparse | 1.6.3 | async-generator | 1.10 |
| attrs | 21.2.0 | azure-core | 1.22.1 | azure-cosmos | 4.2.0 |
| backcall | 0.2.0 | backports.entry-points-selectable | 1.1.1 | bcrypt | 3.2.2 |
| bleach | 4.0.0 | blis | 0.7.8 | boto3 | 1.21.18 |
| botocore | 1.24.18 | cachetools | 5.2.0 | catalogue | 2.0.7 |
| certifi | 2021.10.8 | cffi | 1.14.6 | chardet | 4.0.0 |
| charset-normalizer | 2.0.4 | click | 8.0.3 | cloudpickle | 2.0.0 |
| cmdstanpy | 0.9.68 | configparser | 5.2.0 | convertdate | 2.4.0 |
| cryptography | 3.4.8 | cycler | 0.10.0 | cymem | 2.0.6 |
| Cython | 0.29.24 | databricks-automl-runtime | 0.2.9.1 | databricks-cli | 0.16.8 |
| dbl-tempo | 0.1.12 | dbus-python | 1.2.16 | debugpy | 1.4.1 |
| decorator | 5.1.0 | defusedxml | 0.7.1 | dill | 0.3.4 |
| diskcache | 5.4.0 | distlib | 0.3.4 | distro-info | 0.23ubuntu1 |
| entrypoints | 0.3 | ephem | 4.1.3 | facets-overview | 1.0.0 |
| fasttext | 0.9.2 | filelock | 3.3.1 | Flask | 1.1.2 |
| flatbuffers | 1.12 | fsspec | 2021.8.1 | future | 0.18.2 |
| gast | 0.4.0 | gitdb | 4.0.9 | GitPython | 3.1.27 |
| google-auth | 2.6.0 | google-auth-oauthlib | 0.4.6 | google-pasta | 0.2.0 |
| grpcio | 1.44.0 | gunicorn | 20.1.0 | gviz-api | 1.10.0 |
| h5py | 3.3.0 | hijri-converter | 2.2.4 | holidays | 0.14.2 |
| horovod | 0.24.3 | htmlmin | 0.1.12 | huggingface-hub | 0.8.1 |
| idna | 3.2 | ImageHash | 4.2.1 | imbalanced-learn | 0.8.1 |
| importlib-metadata | 4.8.1 | ipykernel | 6.12.1 | ipython | 7.32.0 |
| ipython-genutils | 0.2.0 | ipywidgets | 7.7.0 | isodate | 0.6.1 |
| itsdangerous | 2.0.1 | jedi | 0.18.0 | Jinja2 | 2.11.3 |
| jmespath | 0.10.0 | joblib | 1.0.1 | joblibspark | 0.5.0 |
| jsonschema | 3.2.0 | jupyter-client | 6.1.12 | jupyter-core | 4.8.1 |
| jupyterlab-pygments | 0.1.2 | jupyterlab-widgets | 1.0.0 | keras | 2.9.0 |
| Keras-Preprocessing | 1.1.2 | kiwisolver | 1.3.1 | korean-lunar-calendar | 0.2.1 |
| langcodes | 3.3.0 | libclang | 14.0.1 | lightgbm | 3.3.2 |
| llvmlite | 0.38.1 | LunarCalendar | 0.0.9 | Mako | 1.2.0 |
| Markdown | 3.3.6 | MarkupSafe | 2.0.1 | matplotlib | 3.4.3 |
| matplotlib-inline | 0.1.2 | missingno | 0.5.1 | mistune | 0.8.4 |
| mleap | 0.20.0 | mlflow-skinny | 1.27.0 | multimethod | 1.8 |
| murmurhash | 1.0.7 | nbclient | 0.5.3 | nbconvert | 6.1.0 |
| nbformat | 5.1.3 | nest-asyncio | 1.5.1 | networkx | 2.6.3 |
| nltk | 3.6.5 | notebook | 6.4.5 | numba | 0.55.2 |
| numpy | 1.20.3 | oauthlib | 3.2.0 | opt-einsum | 3.3.0 |
| packaging | 21.0 | pandas | 1.3.4 | pandas-profiling | 3.1.0 |
| pandocfilters | 1.4.3 | paramiko | 2.9.2 | parso | 0.8.2 |
| pathy | 0.6.2 | patsy | 0.5.2 | petastorm | 0.11.4 |
| pexpect | 4.8.0 | phik | 0.12.2 | pickleshare | 0.7.5 |
| Pillow | 8.4.0 | pip | 21.2.4 | platformdirs | 2.5.2 |
| plotly | 5.8.2 | pmdarima | 1.8.5 | preshed | 3.0.6 |
| prometheus-client | 0.11.0 | prompt-toolkit | 3.0.20 | prophet | 1.0.1 |
| protobuf | 3.19.4 | psutil | 5.8.0 | psycopg2 | 2.9.3 |
| ptyprocess | 0.7.0 | pyarrow | 7.0.0 | pyasn1 | 0.4.8 |
| pyasn1-modules | 0.2.8 | pybind11 | 2.9.2 | pycparser | 2.20 |
| pydantic | 1.8.2 | Pygments | 2.10.0 | PyGObject | 3.36.0 |
| PyJWT | 2.4.0 | PyMeeus | 0.5.11 | PyNaCl | 1.5.0 |
| pyodbc | 4.0.31 | pyparsing | 3.0.4 | pyrsistent | 0.18.0 |
| pystan | 2.19.1.1 | python-apt | 2.0.0+ubuntu0.20.4.7 | python-dateutil | 2.8.2 |
| python-editor | 1.0.4 | pytz | 2021.3 | PyWavelets | 1.1.1 |
| PyYAML | 6.0 | pyzmq | 22.2.1 | regex | 2021.8.3 |
| requests | 2.26.0 | requests-oauthlib | 1.3.1 | requests-unixsocket | 0.2.0 |
| rsa | 4.8 | s3transfer | 0.5.2 | scikit-learn | 0.24.2 |
| scipy | 1.7.1 | seaborn | 0.11.2 | Send2Trash | 1.8.0 |
| setuptools | 58.0.4 | setuptools-git | 1.2 | shap | 0.40.0 |
| simplejson | 3.17.6 | six | 1.16.0 | slicer | 0.0.7 |
| smart-open | 5.2.1 | smmap | 5.0.0 | spacy | 3.3.1 |
| spacy-legacy | 3.0.9 | spacy-loggers | 1.0.2 | spark-tensorflow-distributor | 1.0.0 |
| sqlparse | 0.4.2 | srsly | 2.4.3 | ssh-import-id | 5.10 |
| statsmodels | 0.12.2 | tabulate | 0.8.9 | tangled-up-in-unicode | 0.1.0 |
| tenacity | 8.0.1 | tensorboard | 2.9.1 | tensorboard-data-server | 0.6.1 |
| tensorboard-plugin-profile | 2.8.0 | tensorboard-plugin-wit | 1.8.1 | tensorflow-cpu | 2.9.1 |
| tensorflow-estimator | 2.9.0 | tensorflow-io-gcs-filesystem | 0.26.0 | termcolor | 1.1.0 |
| terminado | 0.9.4 | testpath | 0.5.0 | thinc | 8.0.17 |
| threadpoolctl | 2.2.0 | tokenizers | 0.12.1 | torch | 1.11.0+cpu |
| torchvision | 0.12.0+cpu | tornado | 6.1 | tqdm | 4.62.3 |
| traitlets | 5.1.0 | transformers | 4.20.0 | typer | 0.4.2 |
| typing-extensions | 3.10.0.2 | ujson | 4.0.2 | unattended-upgrades | 0.1 |
| urllib3 | 1.26.7 | virtualenv | 20.8.0 | visions | 0.7.4 |
| wasabi | 0.9.1 | wcwidth | 0.2.5 | webencodings | 0.5.1 |
| websocket-client | 1.3.1 | Werkzeug | 2.0.2 | wheel | 0.37.0 |
| widgetsnbextension | 3.6.0 | wrapt | 1.12.1 | xgboost | 1.5.2 |
| zipp | 3.6.0 | | | | |

Python libraries on GPU clusters

| Library | Version | Library | Version | Library | Version |
| --- | --- | --- | --- | --- | --- |
| absl-py | 1.0.0 | Antergos Linux | 2015.10 (ISO-Rolling) | argon2-cffi | 20.1.0 |
| astor | 0.8.1 | astunparse | 1.6.3 | async-generator | 1.10 |
| attrs | 21.2.0 | azure-core | 1.22.1 | azure-cosmos | 4.2.0 |
| backcall | 0.2.0 | backports.entry-points-selectable | 1.1.1 | bcrypt | 3.2.2 |
| bleach | 4.0.0 | blis | 0.7.8 | boto3 | 1.21.18 |
| botocore | 1.24.18 | cachetools | 5.2.0 | catalogue | 2.0.7 |
| certifi | 2021.10.8 | cffi | 1.14.6 | chardet | 4.0.0 |
| charset-normalizer | 2.0.4 | click | 8.0.3 | cloudpickle | 2.0.0 |
| cmdstanpy | 0.9.68 | configparser | 5.2.0 | convertdate | 2.4.0 |
| cryptography | 3.4.8 | cycler | 0.10.0 | cymem | 2.0.6 |
| Cython | 0.29.24 | databricks-automl-runtime | 0.2.9.1 | databricks-cli | 0.16.8 |
| dbl-tempo | 0.1.12 | dbus-python | 1.2.16 | debugpy | 1.4.1 |
| decorator | 5.1.0 | defusedxml | 0.7.1 | dill | 0.3.4 |
| diskcache | 5.4.0 | distlib | 0.3.4 | distro-info | 0.23ubuntu1 |
| entrypoints | 0.3 | ephem | 4.1.3 | facets-overview | 1.0.0 |
| fasttext | 0.9.2 | filelock | 3.3.1 | Flask | 1.1.2 |
| flatbuffers | 1.12 | fsspec | 2021.8.1 | future | 0.18.2 |
| gast | 0.4.0 | gitdb | 4.0.9 | GitPython | 3.1.27 |
| google-auth | 2.6.0 | google-auth-oauthlib | 0.4.6 | google-pasta | 0.2.0 |
| grpcio | 1.44.0 | gunicorn | 20.1.0 | gviz-api | 1.10.0 |
| h5py | 3.3.0 | hijri-converter | 2.2.4 | holidays | 0.14.2 |
| horovod | 0.24.3 | htmlmin | 0.1.12 | huggingface-hub | 0.8.1 |
| idna | 3.2 | ImageHash | 4.2.1 | imbalanced-learn | 0.8.1 |
| importlib-metadata | 4.8.1 | ipykernel | 6.12.1 | ipython | 7.32.0 |
| ipython-genutils | 0.2.0 | ipywidgets | 7.7.0 | isodate | 0.6.1 |
| itsdangerous | 2.0.1 | jedi | 0.18.0 | Jinja2 | 2.11.3 |
| jmespath | 0.10.0 | joblib | 1.0.1 | joblibspark | 0.5.0 |
| jsonschema | 3.2.0 | jupyter-client | 6.1.12 | jupyter-core | 4.8.1 |
| jupyterlab-pygments | 0.1.2 | jupyterlab-widgets | 1.0.0 | keras | 2.9.0 |
| Keras-Preprocessing | 1.1.2 | kiwisolver | 1.3.1 | korean-lunar-calendar | 0.2.1 |
| langcodes | 3.3.0 | libclang | 14.0.1 | lightgbm | 3.3.2 |
| llvmlite | 0.38.1 | LunarCalendar | 0.0.9 | Mako | 1.2.0 |
| Markdown | 3.3.6 | MarkupSafe | 2.0.1 | matplotlib | 3.4.3 |
| matplotlib-inline | 0.1.2 | missingno | 0.5.1 | mistune | 0.8.4 |
| mleap | 0.20.0 | mlflow-skinny | 1.27.0 | multimethod | 1.8 |
| murmurhash | 1.0.7 | nbclient | 0.5.3 | nbconvert | 6.1.0 |
| nbformat | 5.1.3 | nest-asyncio | 1.5.1 | networkx | 2.6.3 |
| nltk | 3.6.5 | notebook | 6.4.5 | numba | 0.55.2 |
| numpy | 1.20.3 | oauthlib | 3.2.0 | opt-einsum | 3.3.0 |
| packaging | 21.0 | pandas | 1.3.4 | pandas-profiling | 3.1.0 |
| pandocfilters | 1.4.3 | paramiko | 2.9.2 | parso | 0.8.2 |
| pathy | 0.6.2 | patsy | 0.5.2 | petastorm | 0.11.4 |
| pexpect | 4.8.0 | phik | 0.12.2 | pickleshare | 0.7.5 |
| Pillow | 8.4.0 | pip | 21.2.4 | platformdirs | 2.5.2 |
| plotly | 5.8.2 | pmdarima | 1.8.5 | preshed | 3.0.6 |
| prompt-toolkit | 3.0.20 | prophet | 1.0.1 | protobuf | 3.19.4 |
| psutil | 5.8.0 | psycopg2 | 2.9.3 | ptyprocess | 0.7.0 |
| pyarrow | 7.0.0 | pyasn1 | 0.4.8 | pyasn1-modules | 0.2.8 |
| pybind11 | 2.9.2 | pycparser | 2.20 | pydantic | 1.8.2 |
| Pygments | 2.10.0 | PyGObject | 3.36.0 | PyJWT | 2.4.0 |
| PyMeeus | 0.5.11 | PyNaCl | 1.5.0 | pyodbc | 4.0.31 |
| pyparsing | 3.0.4 | pyrsistent | 0.18.0 | pystan | 2.19.1.1 |
| python-apt | 2.0.0+ubuntu0.20.4.7 | python-dateutil | 2.8.2 | python-editor | 1.0.4 |
| pytz | 2021.3 | PyWavelets | 1.1.1 | PyYAML | 6.0 |
| pyzmq | 22.2.1 | regex | 2021.8.3 | requests | 2.26.0 |
| requests-oauthlib | 1.3.1 | requests-unixsocket | 0.2.0 | rsa | 4.8 |
| s3transfer | 0.5.2 | scikit-learn | 0.24.2 | scipy | 1.7.1 |
| seaborn | 0.11.2 | Send2Trash | 1.8.0 | setuptools | 58.0.4 |
| setuptools-git | 1.2 | shap | 0.40.0 | simplejson | 3.17.6 |
| six | 1.16.0 | slicer | 0.0.7 | smart-open | 5.2.1 |
| smmap | 5.0.0 | spacy | 3.3.1 | spacy-legacy | 3.0.9 |
| spacy-loggers | 1.0.2 | spark-tensorflow-distributor | 1.0.0 | sqlparse | 0.4.2 |
| srsly | 2.4.3 | ssh-import-id | 5.10 | statsmodels | 0.12.2 |
| tabulate | 0.8.9 | tangled-up-in-unicode | 0.1.0 | tenacity | 8.0.1 |
| tensorboard | 2.9.1 | tensorboard-data-server | 0.6.1 | tensorboard-plugin-profile | 2.8.0 |
| tensorboard-plugin-wit | 1.8.1 | tensorflow | 2.9.1 | tensorflow-estimator | 2.9.0 |
| tensorflow-io-gcs-filesystem | 0.26.0 | termcolor | 1.1.0 | terminado | 0.9.4 |
| testpath | 0.5.0 | thinc | 8.0.17 | threadpoolctl | 2.2.0 |
| tokenizers | 0.12.1 | torch | 1.11.0+cu113 | torchvision | 0.12.0+cu113 |
| tornado | 6.1 | tqdm | 4.62.3 | traitlets | 5.1.0 |
| transformers | 4.20.0 | typer | 0.4.2 | typing-extensions | 3.10.0.2 |
| ujson | 4.0.2 | unattended-upgrades | 0.1 | urllib3 | 1.26.7 |
| virtualenv | 20.8.0 | visions | 0.7.4 | wasabi | 0.9.1 |
| wcwidth | 0.2.5 | webencodings | 0.5.1 | websocket-client | 1.3.1 |
| Werkzeug | 2.0.2 | wheel | 0.37.0 | widgetsnbextension | 3.6.0 |
| wrapt | 1.12.1 | xgboost | 1.5.2 | zipp | 3.6.0 |
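
To confirm that a cluster's Python environment matches the versions listed above, a notebook cell along the following lines can compare installed distributions against a set of expected pins. This is a minimal sketch using only the standard library; the helper names installed_version and find_mismatches and the choice of three sample pins are assumptions made for illustration.

```python
from importlib import metadata

# A few pins copied from the tables above; extend the mapping as needed.
EXPECTED_VERSIONS = {
    "xgboost": "1.5.2",
    "mlflow-skinny": "1.27.0",
    "scikit-learn": "0.24.2",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to an (expected, found) pair."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED_VERSIONS).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

The lookup argument exists so that the comparison can be exercised without touching the real environment, for example by passing the get method of a dictionary.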

Spark packages containing Python modules

| Spark Package | Python Module | Version |
| --- | --- | --- |
| graphframes | graphframes | 0.8.2-db1-spark3.2 |

R libraries

The R libraries are identical to the R libraries in Databricks Runtime 11.1.

Java and Scala libraries (Scala 2.12 cluster)

In addition to Java and Scala libraries in Databricks Runtime 11.1, Databricks Runtime 11.1 ML contains the following JARs:

CPU clusters

| Group ID | Artifact ID | Version |
| --- | --- | --- |
| com.typesafe.akka | akka-actor_2.12 | 2.5.23 |
| ml.combust.mleap | mleap-databricks-runtime_2.12 | 0.20.0-db1 |
| ml.dmlc | xgboost4j-spark_2.12 | 1.5.2 |
| ml.dmlc | xgboost4j_2.12 | 1.5.2 |
| org.graphframes | graphframes_2.12 | 0.8.2-db1-spark3.2 |
| org.mlflow | mlflow-client | 1.27.0 |
| org.mlflow | mlflow-spark | 1.27.0 |
| org.scala-lang.modules | scala-java8-compat_2.12 | 0.8.0 |
| org.tensorflow | spark-tensorflow-connector_2.12 | 1.15.0 |

GPU clusters

| Group ID | Artifact ID | Version |
| --- | --- | --- |
| com.typesafe.akka | akka-actor_2.12 | 2.5.23 |
| ml.combust.mleap | mleap-databricks-runtime_2.12 | 0.20.0-db1 |
| ml.dmlc | xgboost4j-spark_2.12 | 1.5.2 |
| ml.dmlc | xgboost4j_2.12 | 1.5.2 |
| org.graphframes | graphframes_2.12 | 0.8.2-db1-spark3.2 |
| org.mlflow | mlflow-client | 1.27.0 |
| org.mlflow | mlflow-spark | 1.27.0 |
| org.scala-lang.modules | scala-java8-compat_2.12 | 0.8.0 |
| org.tensorflow | spark-tensorflow-connector_2.12 | 1.15.0 |