August 2020

These features and Databricks platform improvements were released in August 2020.

Note

Releases are staged. Your Databricks account may not be updated until up to a week after the initial release date.

Token Management API is GA and admins can use the Admin Console to grant and revoke user access to tokens

August 26 - September 1, 2020: Version 3.27

Token management is now generally available. Databricks administrators can use the Token Management API and the Admin Console to manage their users’ Databricks personal access tokens. As an admin, you can:

  • Monitor and revoke users’ personal access tokens.
  • Control the lifetime of future tokens in your workspace.
  • Control which users can create and use tokens via the Permissions API or in the Admin Console.

In the transition from Public Preview to GA, the Token Management API parameter created_by was renamed to created_by_id, and a new parameter, created_by_username, was added.

For more information, see Manage personal access tokens.
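
As a minimal sketch of how the renamed and added parameters might be used to filter a token listing, the following Python builds a request URL without sending it. The endpoint path, the helper name list_tokens_url, and the example workspace URL are assumptions for illustration; confirm the path and parameters against the Token Management API reference.

```python
from urllib.parse import urlencode

# Assumed endpoint path for listing tokens; verify against the API reference.
TOKENS_PATH = "/api/2.0/token-management/tokens"


def list_tokens_url(workspace_url, created_by_id=None, created_by_username=None):
    """Build the URL for listing tokens, optionally filtered by creator.

    created_by_id replaces the preview-era created_by parameter;
    created_by_username is the parameter added at GA.
    """
    params = {}
    if created_by_id is not None:
        params["created_by_id"] = created_by_id
    if created_by_username is not None:
        params["created_by_username"] = created_by_username
    url = workspace_url.rstrip("/") + TOKENS_PATH
    return url + ("?" + urlencode(params) if params else "")


# Hypothetical workspace URL and user ID.
print(list_tokens_url("https://example.cloud.databricks.com", created_by_id=123))
```

The request itself would be sent by an admin with an authenticated HTTP client; authentication is omitted here.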

Message size limits for Shiny apps increased

August 26 - September 1, 2020: Version 3.27

The maximum message size for Shiny apps has been increased from 10 MB to 20 MB. If your application’s total size exceeds this limit, review the recommendations in the Shiny FAQ.

Improved instructions for setting up a cluster in local mode

August 26 - September 1, 2020: Version 3.27

In the cluster UI:

  • If you create a cluster with 0 workers, a tooltip appears recommending that you use local mode and showing the associated configuration setting (spark.master local[*]).
  • You can no longer set spark.master local[*] for a cluster unless the cluster has 0 workers.
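
The rule above can be sketched as a small check on a cluster specification. This is an illustration only: the function validate_cluster_spec and the cluster name are hypothetical, and the num_workers and spark_conf field names are assumed to match the Clusters API request format.

```python
def validate_cluster_spec(spec):
    """Mirror the UI rule: spark.master local[*] is allowed only with 0 workers."""
    master = spec.get("spark_conf", {}).get("spark.master")
    if master == "local[*]" and spec.get("num_workers", 0) != 0:
        raise ValueError("spark.master local[*] requires a cluster with 0 workers")
    return spec


# A local-mode cluster: no workers, with Spark running on the driver only.
local_mode_cluster = validate_cluster_spec({
    "cluster_name": "local-mode-example",  # hypothetical name
    "num_workers": 0,
    "spark_conf": {"spark.master": "local[*]"},
})
```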

View version of notebook associated with a run

August 26 - September 1, 2020: Version 3.27

From the Experiments sidebar, you can now display the version of a notebook associated with a run. For details, see View notebook experiment.

Databricks Runtime 7.2 GA

August 20, 2020

Databricks Runtime 7.2 brings many additional features and improvements over Databricks Runtime 7.1, including:

  • Auto Loader is generally available: Auto Loader is an efficient method for incrementally ingesting a large number of files into Delta Lake. It is now GA and adds the following features:

    • Directory listing mode option: Auto Loader adds a new directory listing mode, in addition to the existing file notification mode, for determining when there are new files.
    • Cloud resource management API: You can now use our Scala API to manage cloud resources created by Auto Loader. You can list notification services and tear down specific notification services using this API.
    • Rate limiting option: You can now use the cloudFiles.maxBytesPerTrigger option to limit the amount of data processed in each microbatch.
    • Option validation: Auto Loader now validates the options you provide. If you provide an option that Auto Loader does not recognize, validation will fail. To skip option validation, set cloudFiles.validateOptions to false.
  • Efficiently copy a Delta table with clone.

  • Improvements:

    • S3 storage connector has been updated to the version supported in Hadoop 3.1.3, providing stability improvements.
    • Snowflake connector has been upgraded to version 2.8.1, which includes Spark 3.0 support.
    • Credential passthrough improvements
    • TensorBoard improvements
    • Upgraded Python and R libraries

For details, see the complete Databricks Runtime 7.2 (Unsupported) release notes.
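
The Auto Loader options described above might be combined as in the following sketch. The helper auto_loader_options is hypothetical. The cloudFiles.maxBytesPerTrigger and cloudFiles.validateOptions names come from this release note, while cloudFiles.format and cloudFiles.useNotifications (assumed here to select between directory listing and file notification mode) are taken from the Auto Loader documentation and should be verified there.

```python
def auto_loader_options(file_format, use_notifications=False,
                        max_bytes_per_trigger=None, validate_options=True):
    """Build an options dict for a cloudFiles stream reader."""
    options = {
        "cloudFiles.format": file_format,
        # Assumption: "false" selects directory listing mode,
        # "true" selects file notification mode.
        "cloudFiles.useNotifications": str(use_notifications).lower(),
        # Set to "false" to skip option validation.
        "cloudFiles.validateOptions": str(validate_options).lower(),
    }
    if max_bytes_per_trigger is not None:
        # Rate limit: caps the amount of data processed in each microbatch.
        options["cloudFiles.maxBytesPerTrigger"] = max_bytes_per_trigger
    return options


options = auto_loader_options("json", max_bytes_per_trigger="10g")

# On a Databricks cluster, where `spark` is already defined, the options
# could be applied like this (input_schema and input_path are placeholders):
# stream = (spark.readStream.format("cloudFiles")
#           .options(**options)
#           .schema(input_schema)
#           .load(input_path))
```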

Databricks Runtime 7.2 ML GA

August 20, 2020

Databricks Runtime 7.2 for Machine Learning is built on top of Databricks Runtime 7.2 and brings new and improved Python and system libraries. For details, see the complete Databricks Runtime 7.2 for Machine Learning (Unsupported) release notes.

Databricks Runtime 7.2 Genomics GA

August 20, 2020

Databricks Runtime 7.2 for Genomics is built on top of Databricks Runtime 7.2 and significantly speeds up the conversion of literal numpy 1D and 2D float-typed ndarrays to Java arrays. The Glow genome-wide association study documentation reflects the usage.

For details, see the complete Databricks Runtime 7.2 for Genomics (Unsupported) release notes.

Permissions API (Public Preview)

August 18, 2020

Databricks is pleased to announce the public preview of the Permissions API, which lets you manage permissions for:

  • Tokens
  • Passwords when SSO is enabled
  • Clusters
  • Pools
  • Jobs
  • Notebooks
  • Folders (directories)
  • MLflow registered models

For more information, see Permissions API.
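
As a hedged illustration of managing token permissions, the following sketch builds a request body that grants groups permission to use tokens. The preview path, the access_control_list structure, the CAN_USE permission level, and the group name are assumptions for illustration and should be checked against the Permissions API reference.

```python
import json

# Assumed preview-era path for token permissions; confirm before use.
TOKEN_PERMISSIONS_PATH = "/api/2.0/preview/permissions/authorization/tokens"


def token_permissions_body(group_names, permission_level="CAN_USE"):
    """Build a request body granting each group the given permission level."""
    return {
        "access_control_list": [
            {"group_name": name, "permission_level": permission_level}
            for name in group_names
        ]
    }


body = token_permissions_body(["data-engineers"])  # hypothetical group name
print(json.dumps(body, indent=2))
```

The body would be sent to the path above by an admin using an authenticated HTTP client, which is omitted here.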

Databricks Connect 7.1 (GA)

August 12, 2020

Databricks Connect now supports Databricks Runtime 7.1.

In Databricks Runtime 7.1, Databricks recommends that you always use the most recent version of Databricks Connect.

Repeatable installation order for cluster libraries

August 12-25, 2020: Version 3.26

On a cluster running Databricks Runtime 7.2 or above, Databricks now processes all cluster libraries in the order that they were installed.

Customer-managed VPC is GA

August 12-25, 2020: Version 3.26

Introduced as a Public Preview in June, the customer-managed VPC feature is now generally available for customers on the E2 version of the Databricks platform. By default, clusters are created in a single AWS VPC (Virtual Private Cloud) that Databricks creates and configures in your AWS account. Creating Databricks workspaces in your own VPC allows you to exercise more control over the infrastructure and can help you comply with specific cloud security and governance standards your organization may require. You simply provide your VPC ID, security group ID, and subnet IDs when you create your workspace using the Multi-workspace API (Account API).

For GA, region support for us-east-2 has been added.

For more information, see Customer-managed VPC.
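
The three identifiers mentioned above might be gathered into a network configuration as in this sketch. The helper network_config is hypothetical, the field names are assumed to match the Account API network configuration format, and all IDs are placeholders.

```python
def network_config(network_name, vpc_id, security_group_ids, subnet_ids):
    """Collect the VPC ID, security group IDs, and subnet IDs into one payload."""
    if not vpc_id or not security_group_ids or not subnet_ids:
        raise ValueError("VPC ID, security group IDs, and subnet IDs are all required")
    return {
        "network_name": network_name,
        "vpc_id": vpc_id,
        "security_group_ids": list(security_group_ids),
        "subnet_ids": list(subnet_ids),
    }


# Placeholder identifiers; substitute the values from your own AWS account.
payload = network_config(
    "example-network",
    "vpc-0123456789abcdef0",
    ["sg-0123456789abcdef0"],
    ["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"],
)
```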

This feature is available only on the E2 version of the Databricks platform, not on the existing enterprise platform. Contact your Databricks representative to request access.

Secure cluster connectivity (no public IPs) is GA

August 12-25, 2020: Version 3.26

Introduced as a Public Preview in June, Secure cluster connectivity lets you launch clusters in which all nodes have only private IP addresses, providing enhanced security. Secure cluster connectivity is now generally available in workspaces that are created in a customer-managed VPC using the Multi-workspace API (Account API).

For more information, see Secure cluster connectivity.

This feature is available only on the E2 version of the Databricks platform, not on the existing enterprise platform. Contact your Databricks representative to request access.

Multi-workspace API (Account API) adds pricing tier

August 12-25, 2020: Version 3.26

The pricing tier (Standard, Premium, Enterprise, and so on) is now returned when you create, patch, or get workspaces using the Multi-workspace API (Account API) (Public Preview). The workspace defaults to the pricing tier associated with your account.

Create model from MLflow registered models page (Public Preview)

August 12-25, 2020: Version 3.26

You can now create a new model from the MLflow registered models page. For details, see Create a new registered model and assign a logged model to it.

Databricks Container Services supports GPU images

August 12-25, 2020: Version 3.26

You can now use Databricks Container Services on clusters with GPUs to create portable deep learning environments with customized libraries.

For details, see Databricks Container Services on GPU clusters.