August 2021

These features and Databricks platform improvements were released in August 2021.

Note

Releases are staged. Your Databricks account may not be updated until a week or more after the initial release date.

Databricks Repos GA

August 31, 2021

Databricks Repos is now generally available. With Repos you can create new or clone existing Git repositories in Databricks, work with notebooks in these repositories, follow Git-based development and collaboration best practices, and integrate your CI/CD workflows with Repos APIs. Databricks Repos integrates with GitHub, Bitbucket, GitLab, and Azure DevOps. For details, see Git integration with Databricks Git folders and Repos API.

Serverless SQL provides instant compute, minimal management, and cost optimization for SQL queries (Public Preview)

August 30, 2021

Until now, all Databricks computation happened in the compute plane in your AWS account. The initial release of Serverless compute adds Serverless SQL endpoints to Databricks SQL, moving those compute resources to your Databricks account.

You use serverless SQL warehouses with Databricks SQL queries just as you use the SQL endpoints that live in your own AWS account, now called Classic SQL endpoints. But serverless SQL warehouses typically start up with lower latency than Classic SQL endpoints, are easier to manage, and are optimized for cost.

Before you can create serverless SQL warehouses, an admin must enable the Serverless SQL endpoints option for your workspace. Once enabled, new SQL endpoints are Serverless by default, but you can continue to create SQL endpoints as Serverless or Classic as you like.

For more information, see the Serverless SQL blog article. For details about the Serverless compute architecture and comparisons with the classic compute plane, see Serverless compute plane. For details about configuring serverless SQL warehouses—including how to convert Classic SQL endpoints to Serverless—see Enable serverless SQL warehouses.

For the list of supported regions for serverless SQL warehouses, see Databricks clouds and regions.

Important

Serverless compute is subject to applicable terms that must be accepted by a workspace admin in order to enable the feature.

User interface improvements for Delta Live Tables (Public Preview)

August 23-30, 2021: Version 3.53

This release includes the following enhancements to the Delta Live Tables UI:

  • When creating a pipeline, you can now enter a notebook path directly in the Create Pipeline dialog. Previously, you were restricted to selecting notebooks with the Select a Notebook dialog. See What is Delta Live Tables?.

  • When you click the Spark UI, Logs, or Metrics links on the Pipeline Details page, the link now opens in a new tab.

More control over how tables are materialized in Delta Live Tables pipelines (Public Preview)

August 23-30, 2021: Version 3.53

You can now specify a schema when defining a table with Python in Delta Live Tables, providing more control over how tables are materialized. Previously, only schema inference was supported when defining tables. See the Delta Live Tables Python language reference for more information on defining schemas.
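For example, here is a minimal sketch of a Python table definition with an explicit schema, assuming the decorator's schema argument accepts a SQL DDL string as described in the Python language reference. The table name, columns, and source path are hypothetical.

    import dlt

    @dlt.table(
        comment="Raw sales records materialized with an explicit schema",
        schema="""
            order_id BIGINT,
            customer_id BIGINT,
            amount DOUBLE,
            order_date DATE
        """,
    )
    def raw_sales():
        # spark is provided by the pipeline runtime
        return spark.read.format("json").load("/data/sales/raw")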

Increased timeout for long-running notebook jobs

August 23-30, 2021: Version 3.53

You can now run notebook workflow jobs that take up to 30 days to complete. Previously, only notebook workflow jobs taking up to 48 hours to complete were supported. See Run a Databricks notebook from another notebook for more information.
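As an illustrative sketch, the parent notebook cell below launches a child notebook with a timeout well beyond the old 48-hour cap. The notebook path and arguments are hypothetical; dbutils.notebook.run takes a notebook path, a timeout in seconds, and an optional arguments map.

    # Run a child notebook with a 7-day timeout (604,800 seconds),
    # beyond the previous 48-hour (172,800-second) limit.
    result = dbutils.notebook.run(
        "/Workspace/Jobs/long_running_etl",  # hypothetical notebook path
        604800,                              # timeout_seconds: 7 days
        {"run_date": "2021-08-23"},          # hypothetical arguments
    )
    print(result)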

Jobs service stability and scalability improvements

August 23-30, 2021: Version 3.53

The following changes increase the stability and scalability of the Jobs service:

  • Each new job and run is assigned a longer, unique, numeric, non-sequential identifier. Clients that use the Jobs API and depend on a fixed identifier length or on sequential or monotonically increasing identifiers must be modified to accept identifiers that are longer, non-sequential, and unordered. The identifier type remains int64, and compatibility is preserved for clients that represent identifiers as IEEE 754 64-bit floating-point numbers, for example, JavaScript clients (see the sketch after this list).

  • The value of the number_in_job field, included in the response to some Jobs API requests, is now set to the same value as run_id.
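As a rough illustration of the compatibility point in the first bullet: any int64 value below 2^53 round-trips through an IEEE 754 double without loss, which is why clients that decode JSON numbers as floats (such as JavaScript) keep working. The identifier value in this sketch is made up.

    # Hypothetical run identifier: longer and non-sequential, but still int64.
    run_id = 170733143219291

    # int64 values below 2**53 are exactly representable as IEEE 754 doubles,
    # so JSON decoders that parse numbers as floats preserve them exactly.
    assert float(run_id) == run_id

    # Treat job and run identifiers as opaque: do not assume a fixed length,
    # ordering, or sequential assignment.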

Note

This feature was delayed to February 2022.

User entitlements granted by group membership are displayed in the admin console

August 23-30, 2021: Version 3.53

User entitlements granted by group membership are now displayed for each user on the Users tab in the admin console.

Manage MLflow experiment permissions (Public Preview)

August 23-30, 2021: Version 3.53

You can now manage the permissions of an MLflow experiment from the experiment page. For details, see Change permissions for experiment.

Improved job creation from notebooks

August 23-30, 2021: Version 3.53

You can now edit and clone jobs associated with a notebook. For details, see Create and manage scheduled notebook jobs.

Improved support for collapsing notebook headings

August 23-30, 2021: Version 3.53

You can now collapse or expand all collapsible headings in a notebook. Previously, you could only collapse or expand a single heading at a time. For details, see Collapsible headings.

Databricks Runtime 9.0 and 9.0 ML are GA; 9.0 Photon is Public Preview

August 17, 2021

Databricks Runtime 9.0 and 9.0 ML are now generally available. 9.0 Photon is in Public Preview.

For information, see the full release notes at Databricks Runtime 9.0 (unsupported) and Databricks Runtime 9.0 for ML (unsupported).

Low-latency delivery of audit logs is generally available

August 17, 2021

Audit log delivery is now supported as a self-service configuration for accounts on the Premium plan or above. As a Databricks account owner, you can use the Account API to configure Databricks audit logs to be delivered to your preferred S3 storage location. In addition, if you have a multi-workspace Databricks deployment, you can create a single audit log delivery configuration that is shared by all workspaces in your account. The new audit log delivery framework logs events with low latency, typically delivering logs within 15 minutes of an auditable event.

Legacy audit logging is now deprecated.

For more information and migration instructions, see Configure audit log delivery.
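As a rough sketch of the self-service configuration, the following request creates an account-level audit log delivery configuration through the Account API. All values are placeholders, and the credentials and storage configuration objects must already exist; check the request shape against Configure audit log delivery.

    import requests

    account_id = "<account-id>"  # hypothetical
    resp = requests.post(
        f"https://accounts.cloud.databricks.com/api/2.0/accounts/{account_id}/log-delivery",
        auth=("<account-owner-email>", "<password>"),
        json={
            "log_delivery_configuration": {
                "log_type": "AUDIT_LOGS",
                "output_format": "JSON",
                "credentials_id": "<credentials-id>",
                "storage_configuration_id": "<storage-configuration-id>",
                "delivery_path_prefix": "audit-logs",
                # Omit a workspace filter to share one configuration across
                # all workspaces in the account.
            }
        },
    )
    resp.raise_for_status()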

Databricks Runtime 9.0 (Beta)

August 10, 2021

Databricks Runtime 9.0 and Databricks Runtime 9.0 ML are now available as Beta releases.

For information, see the full release notes at Databricks Runtime 9.0 (unsupported) and Databricks Runtime 9.0 for ML (unsupported).

Manage repos programmatically with the Databricks CLI (Public Preview)

August 9-16, 2021: Version 3.52

You can now manage remote Git repos by using the Databricks Command Line Interface (CLI). See Repos CLI (legacy).

Manage repos programmatically with the Databricks REST API (Public Preview)

August 9-16, 2021: Version 3.52

You can now manage remote Git repos by using the Databricks REST API. See Repos API.
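As a hedged sketch of the API, the snippet below clones a remote repo into the workspace and then checks out a different branch using the requests library. The workspace URL, token, repo URL, and paths are hypothetical; see Repos API for the authoritative request shapes.

    import requests

    host = "https://<your-workspace>.cloud.databricks.com"  # hypothetical
    headers = {"Authorization": "Bearer <personal-access-token>"}

    # Clone a remote Git repo into the workspace.
    resp = requests.post(
        f"{host}/api/2.0/repos",
        headers=headers,
        json={
            "url": "https://github.com/example/project.git",
            "provider": "gitHub",
            "path": "/Repos/someone@example.com/project",
        },
    )
    resp.raise_for_status()
    repo_id = resp.json()["id"]

    # Check out a different branch in the workspace copy of the repo.
    requests.patch(
        f"{host}/api/2.0/repos/{repo_id}",
        headers=headers,
        json={"branch": "dev"},
    ).raise_for_status()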

Databricks Runtime 7.6 series support ends

August 8, 2021

Support for Databricks Runtime 7.6, Databricks Runtime 7.6 for Machine Learning, and Databricks Runtime 7.6 for Genomics ended on August 8. See Databricks runtime support lifecycles.

Log delivery APIs now report delivery status

August 4, 2021

Log delivery configurations in AWS now contain a log delivery status field named log_delivery_status. It reports the most recent delivery attempt, including when the attempt occurred and whether it succeeded or failed, and it surfaces failure mode categories and detailed failure messages to help you troubleshoot delivery failures. See Configure audit log delivery.
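For example, here is a hedged sketch of reading the new field from the Account API; the account ID and credentials are placeholders, and the exact keys inside log_delivery_status are documented in Configure audit log delivery.

    import requests

    account_id = "<account-id>"  # hypothetical
    resp = requests.get(
        f"https://accounts.cloud.databricks.com/api/2.0/accounts/{account_id}/log-delivery",
        auth=("<account-owner-email>", "<password>"),
    )
    resp.raise_for_status()

    for config in resp.json().get("log_delivery_configurations", []):
        # log_delivery_status describes the most recent delivery attempt.
        print(config.get("config_name"), config.get("log_delivery_status"))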

Use the AWS EBS SSD gp3 volume type for all clusters in a workspace

August 9-16, 2021: Version 3.52

You can now select either gp2 or gp3 as the AWS EBS SSD volume type for all clusters in a Databricks workspace. Databricks recommends you use gp3 for its cost savings compared to gp2.

For information, see Manage SSD storage.

Audit events are logged when you interact with Databricks Repos

August 9-13, 2021: Version 3.52

When audit logging is enabled, an audit event is now logged when you create, update, or delete a Databricks Repo, when you list all Databricks Repos associated with a workspace, and when you sync changes between a Databricks Repo and a remote repo. For more information, see Git folder events.

Improved job creation and management workflow

August 9-13, 2021: Version 3.52

You can now view and manage jobs associated with a notebook. Specifically, you can start a job run, view the current or most recent run, pause or resume the job’s schedule, and delete the job.

The notebook job creation UI has been revised and new configuration options added. For details, see Create and manage scheduled notebook jobs.

Simplified instructions for setting Git credentials (Public Preview)

August 9-13, 2021: Version 3.52

The instructions on the Git integration tab of the User Settings page have been simplified.

Import multiple notebooks in .html format

August 9-13, 2021: Version 3.52

You can now import multiple notebooks in .html format in a .zip file. Previously, you could only import a single notebook in .html format at a time.

The .zip file can contain folders and notebooks in either .html format or source file format (Python, Scala, SQL, or R). A .zip file cannot include both formats.

Usability improvements for Delta Live Tables

August 9-13, 2021: Version 3.52

This release includes the following enhancements to the Delta Live Tables runtime and UI:

  • When creating a pipeline, you can now specify a target database for publishing your Delta Live Tables tables and metadata. See Publish data from Delta Live Tables pipelines to the Hive metastore for more information on publishing datasets.

  • Notebooks now support syntax highlighting for keywords in SQL dataset definitions. You can use this syntax highlighting to ensure the correctness of your Delta Live Tables SQL statements. See the SQL language reference for details on the Delta Live Tables SQL syntax.

  • The Delta Live Tables runtime now emits your pipeline graph before running the pipeline, allowing you to see the graph in the UI sooner.

  • All Python libraries configured in your notebooks are now installed before any Python code runs, ensuring that libraries are globally accessible to every Python notebook in your pipeline.

Configure Databricks for SSO with Microsoft Entra ID in your Azure tenant

August 3, 2021

You can now configure Databricks for SSO with Microsoft Entra ID in your Azure tenant. For more information, see SSO with Microsoft Entra ID (formerly Azure Active Directory) for your workspace.