October 2019

These features and Databricks platform improvements were released in October 2019.

Note

Releases are staged, so your Databricks account may not be updated for up to a week after the initial release date.

Databricks Runtime 6.1 for Genomics GA

October 22, 2019

Databricks Runtime 6.1 for Genomics is generally available. See Databricks Runtime for Genomics.

Databricks Runtime 6.1 for Machine Learning GA

October 22, 2019

Databricks Runtime 6.1 ML is generally available. It includes support for GPU clusters and upgrades to the following machine learning libraries:

  • TensorFlow to 1.14.0
  • PyTorch to 1.2.0
  • Torchvision to 0.4.0
  • MLflow to 1.3.0

For more information, see the complete Databricks Runtime 6.1 ML (Unsupported) release notes.

MLflow API calls are now rate limited

October 22 - 29, 2019: Version 3.5

To ensure high quality of service under heavy load, Databricks now enforces API rate limits for all MLflow API calls. The limits are set per account to ensure fair usage and high availability for all organizations sharing a workspace.

MLflow clients with automatic retries are available in MLflow 1.3.0, which is included in Databricks Runtime 6.1 ML (Unsupported). We advise all customers to switch to the latest MLflow client version.

For details, see MLflow REST API.
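If you call the MLflow REST API from your own code rather than through an MLflow client, you may want to retry rate-limited requests yourself. The following is a minimal sketch of retrying with exponential backoff, not the MLflow client's own retry logic. The `RateLimited` exception and the `call_with_backoff` helper are hypothetical names, and it is an assumption that your request function raises `RateLimited` when the server rejects a call as rate limited (for example, on an HTTP 429 response).

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by the wrapped call when a request is rate limited."""


def call_with_backoff(call, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Run call(), retrying with exponential backoff and jitter on RateLimited."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay doubles on each attempt (0.5 s, 1 s, 2 s, ...),
            # plus up to 100 ms of random jitter to spread out retries.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Pass any zero-argument callable that performs one API request. The `sleep` parameter exists only so that the delay can be replaced in tests.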

Pools of instances for quick cluster launch generally available

October 22 - 29, 2019: Version 3.5

The Databricks feature that supports attaching a cluster to a predefined pool of idle instances is now generally available.

Databricks does not charge DBUs while instances are idle in the pool. Instance provider billing does apply; see pricing.

For details, see Pools.
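As an illustration of attaching a cluster to a pool, the sketch below builds a cluster-creation payload that references a pool by ID. It assumes the Clusters API accepts an `instance_pool_id` field in the cluster specification; the helper name, the cluster name, and the runtime version string in the usage lines are placeholders, not values taken from this release.

```python
import json


def cluster_spec_for_pool(cluster_name, instance_pool_id, num_workers, spark_version):
    """Build a cluster-creation payload that draws its instances from a pool."""
    return {
        "cluster_name": cluster_name,
        "spark_version": spark_version,
        # Assumption: the pool determines the instance type, so no
        # node_type_id is set in the payload.
        "instance_pool_id": instance_pool_id,
        "num_workers": num_workers,
    }


if __name__ == "__main__":
    # "<pool-id>" is a placeholder; the version string is an assumed example.
    spec = cluster_spec_for_pool("nightly-etl", "<pool-id>", 4, "6.1.x-scala2.11")
    print(json.dumps(spec, indent=2))
```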

New instance types (Beta)

October 22 - 29, 2019: Version 3.5

Databricks now provides beta support for the r5d and m5d series of EC2 instances for workloads that require access to NVMe SSD storage.

Databricks Runtime 6.1 GA

October 16, 2019

Databricks Runtime 6.1 brings several enhancements to Delta Lake:

  • Easily convert tables to Delta Lake format
  • Python APIs for Delta tables (Public Preview)
  • Dynamic File Pruning (DFP) enabled by default
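As a rough illustration of the table conversion listed above, the sketch below builds a `CONVERT TO DELTA` statement for a Parquet directory. The helper name and the example path are hypothetical; check the statement syntax against the Delta Lake documentation for your runtime version before relying on it.

```python
def convert_to_delta_sql(parquet_path, partition_schema=None):
    """Return a CONVERT TO DELTA statement for a Parquet directory.

    partition_schema is an optional column list such as "date DATE",
    needed only when the Parquet data is partitioned.
    """
    statement = f"CONVERT TO DELTA parquet.`{parquet_path}`"
    if partition_schema:
        statement += f" PARTITIONED BY ({partition_schema})"
    return statement
```

In a notebook, the result could be passed to `spark.sql(...)`, for example `spark.sql(convert_to_delta_sql("/data/events", "date DATE"))`.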

Note

Starting with the 6.1 release, Databricks Runtime supports only CPU clusters. If you want to use GPU clusters, you must use Databricks Runtime ML.

For more information, see the complete Databricks Runtime 6.1 (Unsupported) release notes.

Databricks Runtime 6.0 for Genomics GA

October 16, 2019

Databricks Runtime for Genomics (Databricks Runtime Genomics) is a variant of Databricks Runtime optimized for working with genomic and biomedical data. Beginning with release 6.0, Databricks Runtime for Genomics is generally available.

Non-admin Databricks users can read user and group names and IDs using SCIM API

October 8 - 15, 2019: Version 3.4

Non-admin users can now invoke the SCIM API Get Users and Get Groups endpoints to read user and group display names and IDs only. All other SCIM API operations continue to require administrator access.
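The sketch below shows one way a non-admin user might prepare such a request and read names and IDs from the response. The endpoint path `/api/2.0/preview/scim/v2/` and the `Resources`, `displayName`, and `id` fields of the response are assumptions based on the SCIM 2.0 list response format; the helper names are hypothetical.

```python
import json
import urllib.request


def build_scim_list_request(workspace_url, token, resource="Users"):
    """Prepare (but do not send) a GET request for SCIM Users or Groups."""
    # Assumed endpoint path; verify it against the SCIM API reference.
    url = f"{workspace_url.rstrip('/')}/api/2.0/preview/scim/v2/{resource}"
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/scim+json",
    }
    return urllib.request.Request(url, headers=headers)


def names_and_ids(list_response_body):
    """Extract (displayName, id) pairs from a SCIM list response body."""
    payload = json.loads(list_response_body)
    return [(r.get("displayName"), r.get("id")) for r in payload.get("Resources", [])]
```

To send the request, pass it to `urllib.request.urlopen` and hand the decoded response body to `names_and_ids`.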

Workspace API returns notebook and folder object IDs

October 8 - 15, 2019: Version 3.4

The get-status and list endpoints of the Workspace API now return notebook and folder object IDs, giving you the ability to reference those objects in other API calls.
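As a sketch of how the returned ID might be read, the helpers below build a get-status URL and pull the ID out of a response body. The endpoint path `/api/2.0/workspace/get-status` and the `object_id` response field are assumptions to verify against the Workspace API reference; the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode


def get_status_url(workspace_url, path):
    """Build the get-status URL for a notebook or folder path."""
    query = urlencode({"path": path})  # percent-encodes "/" and "@" in the path
    return f"{workspace_url.rstrip('/')}/api/2.0/workspace/get-status?{query}"


def object_id_from_status(response_body):
    """Return the object ID from a get-status response body, or None if absent."""
    return json.loads(response_body).get("object_id")
```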

Databricks Runtime 6.0 ML GA

October 4, 2019

Databricks Runtime 6.0 ML includes the following updates:

  • MLflow
    • A new Spark data source for MLflow experiments now provides a standard API to load MLflow experiment run data.
    • An MLflow Java client has been added.
    • MLflow is now a top-tier library.
  • Hyperopt GA - Notable improvements since public preview include support for MLflow logging on Spark workers, correct handling of PySpark broadcast variables, and a new guide on model selection using Hyperopt.
  • Upgraded Horovod and MLflow libraries and Anaconda distribution.
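To illustrate the Spark data source for MLflow experiments mentioned above, the sketch below loads an experiment's run data as a DataFrame. It assumes the data source is registered under the format name `mlflow-experiment` and that an active `SparkSession` is passed in; the helper name is hypothetical.

```python
def load_experiment_runs(spark, experiment_id):
    """Load the run data of one MLflow experiment as a Spark DataFrame.

    spark is an active SparkSession; experiment_id is the experiment ID
    as a string. The format name is an assumption to verify for your runtime.
    """
    return spark.read.format("mlflow-experiment").load(experiment_id)
```

In a notebook, where `spark` is already defined, this could be called as `load_experiment_runs(spark, "<experiment-id>")`.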

Note

Only CPU clusters are supported in this release.

For more information, see the complete Databricks Runtime 6.0 ML (Unsupported) release notes.

Databricks Runtime 6.0 GA

October 1, 2019

Databricks Runtime 6.0 brings many library upgrades and new features, including:

  • New Scala and Java APIs for Delta Lake DML commands, as well as the vacuum and history utility commands.
  • Enhanced DBFS FUSE client for faster and more reliable reads and writes during model training.
  • Support for multiple matplotlib plots per notebook cell.
  • Update to Python 3.7, as well as updated numpy, pandas, matplotlib, and other libraries.
  • Sunset of Python 2 support.

Note

Only CPU clusters are supported in this release.

For more information, see the complete Databricks Runtime 6.0 (Unsupported) release notes.

Account usage reports now show usage by user name

October 1, 2019

Account usage reports, available for download from the Usage Overview tab of the Account Console, now let account owners identify usage by user name, not just user ID. This can help streamline usage distribution analysis and chargebacks.
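For a chargeback analysis, a downloaded report could be summed per user along the following lines. This is only a sketch: the column names `user_name` and `dbus` are hypothetical placeholders, so replace them with the actual headers in your report.

```python
import csv
import io
from collections import defaultdict


def dbus_by_user(report_csv, user_column="user_name", dbu_column="dbus"):
    """Sum a numeric usage column per user from CSV text.

    The default column names are hypothetical; pass the real header
    names from your downloaded usage report.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(report_csv)):
        totals[row[user_column]] += float(row[dbu_column])
    return dict(totals)
```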