January 2019

These features and Databricks platform improvements were released in January 2019.

Note

Releases are staged. Your Databricks account might not be updated until a week after the initial release date.

Upcoming change: Python 3 to become the default when you create clusters

January 29, 2019

When Databricks platform version 2.91 is released in mid-February, the default Python version for new clusters will switch from Python 2 to Python 3. Existing clusters will keep their current Python version. If you have been relying on the Python 2 default when you create clusters, you will need to select the Python version explicitly from then on.

../../../_images/python-version-3.png
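
If you create clusters programmatically, one way to avoid depending on any default is to state the Python version in the cluster specification itself. The following is a minimal sketch, not an official recipe. It assumes that the Python version is selected through the `PYSPARK_PYTHON` entry in `spark_env_vars` and that the Python 3 interpreter lives at `/databricks/python3/bin/python3`. The cluster name, node type, and other values are hypothetical placeholders.

```python
import json

# Assumption: path of the Python 3 interpreter on Databricks cluster nodes.
PYTHON3_PATH = "/databricks/python3/bin/python3"


def with_python3(cluster_spec):
    """Return a copy of cluster_spec that explicitly selects Python 3."""
    spec = dict(cluster_spec)
    env = dict(spec.get("spark_env_vars", {}))
    env["PYSPARK_PYTHON"] = PYTHON3_PATH
    spec["spark_env_vars"] = env
    return spec


# Hypothetical cluster specification; every value here is a placeholder.
base_spec = {
    "cluster_name": "example-cluster",
    "spark_version": "5.2.x-scala2.11",
    "node_type_id": "i3.xlarge",
    "num_workers": 2,
}

pinned_spec = with_python3(base_spec)
print(json.dumps(pinned_spec, indent=2))
```

Because the helper copies the specification before modifying it, `base_spec` is left unchanged and can be reused for clusters that must stay on another Python version.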

Databricks Runtime 5.2 for Machine Learning (Beta) release

January 24, 2019

Databricks Runtime 5.2 ML is built on top of Databricks Runtime 5.2. It contains many popular machine learning libraries, including TensorFlow, PyTorch, Keras, and XGBoost, and provides distributed TensorFlow training using Horovod. In addition to library updates since Databricks Runtime ML 5.1, Databricks Runtime 5.2 ML includes the following new features:

  • GraphFrames now supports the Pregel API (Python) with Databricks performance optimizations.
  • HorovodRunner improvements:
    • On a GPU cluster, training processes are now mapped to GPUs instead of worker nodes, which simplifies support for multi-GPU instance types. With this built-in support, you can distribute training across all of the GPUs on a multi-GPU machine without custom code.
    • HorovodRunner.run() now returns the return value from the first training process.
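
To illustrate the new return value, here is a minimal sketch of how a driver notebook might collect a result from `HorovodRunner.run()`. The `train` function is a placeholder that performs no real training, and the helper name `run_with_horovod` is hypothetical. The `sparkdl` import is deferred because the package is assumed to be available only on Databricks Runtime ML clusters.

```python
def train(learning_rate=0.1):
    """Placeholder training function.

    A real version would build a model and use Horovod inside this body.
    Whatever it returns is what run() hands back to the caller.
    """
    final_loss = 0.0  # placeholder value, no training happens here
    return {"learning_rate": learning_rate, "final_loss": final_loss}


def run_with_horovod(num_processes, **kwargs):
    """Launch train() with HorovodRunner and return its result."""
    # Imported lazily: sparkdl is assumed to exist only on Databricks Runtime ML.
    from sparkdl import HorovodRunner

    hr = HorovodRunner(np=num_processes)
    # As of Databricks Runtime 5.2 ML, run() returns the value that the
    # first training process returned.
    return hr.run(train, **kwargs)
```

On a Databricks Runtime 5.2 ML cluster, a call such as `result = run_with_horovod(2, learning_rate=0.05)` would then give the driver the dictionary returned by the first training process.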

See the complete release notes for Databricks Runtime 5.2 ML (Beta).

Databricks Runtime 5.2 release

January 24, 2019

Databricks Runtime 5.2 is now available. Databricks Runtime 5.2 includes Apache Spark 2.4.0, new Delta Lake and Structured Streaming features and upgrades, and upgraded Python, R, Java, and Scala libraries. For details, see Databricks Runtime 5.2.

Cluster configuration JSON view

January 15-22, 2019

The cluster configuration page now supports a JSON view:

../../../_images/cluster-json-aws.png

The JSON view is read-only. However, you can copy the JSON and use it to create and update clusters with the Clusters API.
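
As a rough sketch of that workflow, the following builds (but does not send) a Clusters API request from JSON copied out of the JSON view. The workspace URL, token, and cluster values are hypothetical placeholders, and the helper name `build_clusters_request` is an illustration rather than part of any Databricks library. The sketch assumes the `clusters/create` and `clusters/edit` endpoints of REST API 2.0; depending on what the JSON view includes, you may need to remove fields that the endpoint does not accept.

```python
import json
import urllib.request


def build_clusters_request(host, token, action, cluster_json):
    """Build, without sending, a Clusters API request from copied JSON."""
    if action not in ("create", "edit"):
        raise ValueError("action must be 'create' or 'edit'")
    spec = json.loads(cluster_json)
    # Editing targets an existing cluster, so the spec must identify it.
    if action == "edit" and "cluster_id" not in spec:
        raise ValueError("editing a cluster requires a cluster_id")
    return urllib.request.Request(
        url=f"{host}/api/2.0/clusters/{action}",
        data=json.dumps(spec).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical JSON, as it might be copied from the JSON view.
copied_json = """
{
  "cluster_name": "example-cluster",
  "spark_version": "5.2.x-scala2.11",
  "node_type_id": "i3.xlarge",
  "num_workers": 2
}
"""

request = build_clusters_request(
    "https://example.cloud.databricks.com",  # placeholder workspace URL
    "dapi-EXAMPLE-TOKEN",                    # placeholder access token
    "create",
    copied_json,
)
print(request.full_url)
# To actually send it: urllib.request.urlopen(request)
```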

Library UI

January 2-9, 2019: Version 2.88

The library UI improvements that were originally released in November 2018 and reverted shortly thereafter have been re-released. These updates make it easier to upload, install, and manage libraries for your Databricks clusters.

The Databricks UI now supports both Workspace libraries and cluster-installed libraries. A Workspace library exists in the Workspace and can be installed on one or more clusters. A cluster-installed library is a library that exists only in the context of the cluster that it is installed on. In addition:

  • You can now create a library from a file uploaded to blob storage.
  • You can now install and uninstall libraries from the library details page and a cluster’s Libraries tab.
  • Libraries installed using the API now display in a cluster’s Libraries tab.
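
For context on the last item, an API-driven installation might look like the following sketch. It only builds the request body and assumes the `libraries/install` endpoint of REST API 2.0 with a `pypi` library entry. The cluster ID and package name are hypothetical.

```python
import json


def install_request_body(cluster_id, pypi_packages):
    """Request body for POST /api/2.0/libraries/install (PyPI packages only)."""
    return {
        "cluster_id": cluster_id,
        "libraries": [{"pypi": {"package": name}} for name in pypi_packages],
    }


# Hypothetical cluster ID and package list.
body = install_request_body("1234-567890-abcde123", ["simplejson"])
print(json.dumps(body, indent=2))
```

A library installed this way would be expected to appear in the Libraries tab of the cluster identified by `cluster_id`.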

For details, see Libraries.

Cluster Events

January 2-9, 2019: Version 2.88

New cluster events were added to reflect Spark driver status. For details, see ClusterEventType.
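
As a hedged illustration, the sketch below builds a request body for the cluster events endpoint (`POST /api/2.0/clusters/events`) that asks only for driver-related events, and filters a response for them. The event type names listed are an assumption about which `ClusterEventType` values describe driver status; check the ClusterEventType reference for the authoritative list. The cluster ID and sample response are hypothetical.

```python
# Assumption: these ClusterEventType values are the driver-status events.
DRIVER_EVENT_TYPES = [
    "DRIVER_HEALTHY",
    "DRIVER_UNAVAILABLE",
    "DRIVER_NOT_RESPONDING",
]


def driver_events_query(cluster_id, limit=50):
    """Request body that restricts the events listing to driver events."""
    return {
        "cluster_id": cluster_id,
        "event_types": list(DRIVER_EVENT_TYPES),
        "limit": limit,
    }


def driver_events(response):
    """Keep only driver-status events from an events API response dict."""
    return [
        event
        for event in response.get("events", [])
        if event.get("type") in DRIVER_EVENT_TYPES
    ]


# Hypothetical response, trimmed to the fields used here.
sample_response = {
    "events": [
        {"type": "RUNNING", "timestamp": 1546300800000},
        {"type": "DRIVER_NOT_RESPONDING", "timestamp": 1546300860000},
        {"type": "DRIVER_HEALTHY", "timestamp": 1546300920000},
    ]
}

query = driver_events_query("1234-567890-abcde123")
found = driver_events(sample_response)
print([event["type"] for event in found])
```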

Cluster UI

January 2-9, 2019: Version 2.88

Note

These updates will be rolled out to some Databricks customers in version 2.88 and to the remainder in version 2.89, which will be released during the third week of January.

The cluster creation page has been cleaned up and reorganized for ease of use, and it now includes an Advanced Options toggle.

../../../_images/clusters-ui.gif