August 2021
These features and Databricks platform improvements were released in August 2021.
Note
Releases are staged. Your Databricks account may not be updated until a week or more after the initial release date.
Databricks Repos GA
August 31, 2021
Databricks Repos is now generally available. With Repos you can create new or clone existing Git repositories in Databricks, work with notebooks in these repositories, follow Git-based development and collaboration best practices, and integrate your CI/CD workflows with Repos APIs. Databricks Repos integrates with GitHub, Bitbucket, GitLab, and Azure DevOps. For details, see Git integration for Databricks Git folders and Repos API.
Serverless SQL provides instant compute, minimal management, and cost optimization for SQL queries (Public Preview)
August 30, 2021
Until now, all Databricks computation happened in the compute plane in your AWS account. The initial release of Serverless compute adds Serverless SQL endpoints to Databricks SQL, moving those compute resources to your Databricks account.
You use serverless SQL warehouses with Databricks SQL queries just like you use the SQL endpoints that live in your own AWS account, now called Classic SQL endpoints. But serverless SQL warehouses typically start with lower latency than Classic SQL endpoints, are easier to manage, and are optimized for cost.
Before you can create serverless SQL warehouses, an admin must enable the Serverless SQL endpoints option for your workspace. Once enabled, new SQL endpoints are Serverless by default, but you can continue to create SQL endpoints as Serverless or Classic as you like.
For more information, see the Serverless SQL blog article. For details about the Serverless compute architecture and comparisons with the classic compute plane, see Serverless compute plane. For details about configuring serverless SQL warehouses—including how to convert Classic SQL endpoints to Serverless—see Enable serverless SQL warehouses.
For the list of supported regions for serverless SQL warehouses, see Databricks clouds and regions.
Important
Serverless compute is subject to applicable terms that must be accepted by a workspace admin in order to enable the feature.
User interface improvements for Delta Live Tables (Public Preview)
Aug 23-30, 2021: Version 3.53
This release includes the following enhancements to the Delta Live Tables UI:
When creating a pipeline, you can now directly enter a notebook path in the Create Pipeline dialog. Previously, you were restricted to using the Select a Notebook dialog to select notebooks. See What is Delta Live Tables?
When you click the Spark UI, Logs, or Metrics links in the Pipeline Details page, the link now opens in a new tab.
More control over how tables are materialized in Delta Live Tables pipelines (Public Preview)
Aug 23-30, 2021: Version 3.53
You can now specify a schema when defining a table with Python in Delta Live Tables, providing more control over how tables are materialized. Previously, only schema inference was supported when defining tables. See the Delta Live Tables Python language reference for more information on defining schemas.
Increased timeout for long-running notebook jobs
Aug 23-30, 2021: Version 3.53
You can now run notebook workflow jobs that take up to 30 days to complete. Previously, only notebook workflow jobs taking up to 48 hours to complete were supported. See Orchestrate notebooks and modularize code in notebooks for more information.
Jobs service stability and scalability improvements
Aug 23-30, 2021: Version 3.53
The following changes increase the stability and scalability of the Jobs service:
Each new job and run is assigned a longer, unique, numeric, non-sequential identifier. Clients that use the Jobs API and depend on a fixed identifier length or sequential or monotonically increasing identifiers must be modified to accept identifiers that are longer, non-sequential, and unordered. The identifier type of int64 remains unchanged, and compatibility is preserved for clients that use IEEE 754 64-bit floating-point numbers, for example, JavaScript clients.
The value of the number_in_job field, included in the response to some Jobs API requests, is now set to the same value as run_id.
Note
This feature was delayed to February 2022.
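The identifier guidance above can be sketched in Python. This is a minimal illustration, not Databricks client code: the helper names and sample identifiers are hypothetical, and it assumes that compatibility with IEEE 754 64-bit floating-point numbers means identifiers stay within the range of integers a double represents exactly (magnitude at most 2^53).

```python
# Every integer with magnitude up to 2**53 is exactly representable
# as an IEEE 754 64-bit float (the number type JavaScript uses).
MAX_EXACT_DOUBLE_INT = 2 ** 53


def fits_exactly_in_double(identifier: int) -> bool:
    """Return True if the identifier is guaranteed to survive a float64 conversion."""
    return abs(identifier) <= MAX_EXACT_DOUBLE_INT


def parse_identifier(raw: str) -> int:
    """Parse an identifier without assuming a fixed length or any ordering."""
    value = int(raw)
    if not 0 <= value < 2 ** 63:
        raise ValueError(f"identifier {raw!r} does not fit in a signed int64")
    return value


# Hypothetical identifiers of different lengths, in no particular order.
# Store them as opaque keys instead of relying on them being sequential.
run_ids = [parse_identifier(s) for s in ["907615230483", "42", "5310877412096"]]
runs_by_id = {run_id: f"run-{run_id}" for run_id in run_ids}
```

Keying lookups on the identifier itself, instead of sorting by it or computing "the next ID", keeps a client independent of how identifiers are generated.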
User entitlements granted by group membership are displayed in the admin console
Aug 23-30, 2021: Version 3.53
User entitlements granted by group membership are now displayed for each user on the Users tab in the admin console.
Manage MLflow experiment permissions (Public Preview)
Aug 23-30, 2021: Version 3.53
You can now manage the permissions of an MLflow experiment from the experiment page. For details, see Change permissions for an experiment.
Improved job creation from notebooks
Aug 23-30, 2021: Version 3.53
You can now edit and clone jobs associated with a notebook. For details, see Create and manage scheduled notebook jobs.
Improved support for collapsing notebook headings
Aug 23-30, 2021: Version 3.53
You can now collapse or expand all collapsible headings in a notebook. Previously, you could only collapse or expand a single heading at a time. For details, see Collapsible headings.
Databricks Runtime 9.0 and 9.0 ML are GA; 9.0 Photon is Public Preview
August 17, 2021
Databricks Runtime 9.0 and 9.0 ML are now generally available. 9.0 Photon is in Public Preview.
For information, see the full release notes at Databricks Runtime 9.0 (EoS) and Databricks Runtime 9.0 for ML (EoS).
Low-latency delivery of audit logs is generally available
August 17, 2021
Audit log delivery is now supported as a self-service configuration for accounts on the Premium plan or above. As a Databricks account owner, you can use the Account API to configure Databricks audit logs to be delivered to your preferred S3 storage location. In addition, if you have a multi-workspace Databricks deployment, you can create a single audit log delivery configuration that is shared by all workspaces in your account. The new audit log delivery framework logs events with low latency, typically delivering logs within 15 minutes of an auditable event.
Legacy audit logging is now deprecated.
For more information and migration instructions, see Configure audit log delivery.
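As a rough sketch of what an Account API call for audit log delivery might look like, the Python below builds the request URL and JSON body without sending anything. The endpoint path, the field names (log_delivery_configuration, log_type, output_format, credentials_id, storage_configuration_id, workspace_ids_filter), and the config name are assumptions to verify against the Account API reference; the function name and sample IDs are hypothetical.

```python
import json


def build_audit_log_delivery_request(account_id, credentials_id,
                                     storage_configuration_id,
                                     workspace_ids=None):
    """Return (url, json_body) for creating an audit log delivery configuration.

    Nothing is sent over the network; this only assembles the request.
    """
    # Assumed endpoint layout for the Account API.
    url = (
        "https://accounts.cloud.databricks.com"
        f"/api/2.0/accounts/{account_id}/log-delivery"
    )
    config = {
        "config_name": "audit-logs-to-s3",  # hypothetical name
        "log_type": "AUDIT_LOGS",
        "output_format": "JSON",
        "credentials_id": credentials_id,
        "storage_configuration_id": storage_configuration_id,
    }
    # Assumption: leaving the workspace filter out yields one configuration
    # shared by all workspaces in the account.
    if workspace_ids:
        config["workspace_ids_filter"] = list(workspace_ids)
    return url, json.dumps({"log_delivery_configuration": config})


url, body = build_audit_log_delivery_request(
    "acct-123", "cred-456", "storage-789"
)
```

The credentials and storage configuration IDs refer to objects that would have to be created beforehand; see Configure audit log delivery for the actual procedure.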
Databricks Runtime 9.0 (Beta)
August 10, 2021
Databricks Runtime 9.0 and Databricks Runtime 9.0 ML are now available as Beta releases.
For information, see the full release notes at Databricks Runtime 9.0 (EoS) and Databricks Runtime 9.0 for ML (EoS).
Manage repos programmatically with the Databricks CLI (Public Preview)
Aug 9-16, 2021: Version 3.52
You can now manage remote Git repos by using the Databricks Command Line Interface (CLI). See Databricks CLI (legacy).
Manage repos programmatically with the Databricks REST API (Public Preview)
Aug 9-16, 2021: Version 3.52
You can now manage remote Git repos by using the Databricks REST API. See Repos API.
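The following Python sketch shows how requests to the Repos API might be assembled with the standard library. It only constructs the request objects and does not send them. The endpoint paths, the request fields (url, provider, path, branch), and the provider value are assumptions to confirm against the Repos API reference; the host, token, and repo path are placeholders.

```python
import json
import urllib.request


def _json_request(method, url, token, payload):
    """Build an authenticated JSON request object without sending it."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def build_create_repo_request(host, token, git_url, provider, path):
    """Request that clones a remote Git repository into the workspace."""
    payload = {"url": git_url, "provider": provider, "path": path}
    return _json_request("POST", f"{host}/api/2.0/repos", token, payload)


def build_checkout_branch_request(host, token, repo_id, branch):
    """Request that switches an existing repo to the given branch."""
    return _json_request(
        "PATCH", f"{host}/api/2.0/repos/{repo_id}", token, {"branch": branch}
    )


# Placeholder values for illustration only.
create_req = build_create_repo_request(
    "https://example.cloud.databricks.com",
    "PLACEHOLDER_TOKEN",
    "https://github.com/example-org/example-repo.git",
    "gitHub",
    "/Repos/someone@example.com/example-repo",
)
checkout_req = build_checkout_branch_request(
    "https://example.cloud.databricks.com", "PLACEHOLDER_TOKEN", 123, "main"
)
```

Passing either object to urllib.request.urlopen would send it; a CI/CD job could use the second request to move a repo to a release branch.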
Databricks Runtime 7.6 series support ends
August 8, 2021
Support for Databricks Runtime 7.6, Databricks Runtime 7.6 for Machine Learning, and Databricks Runtime 7.6 for Genomics ended on August 8. See Databricks support lifecycles.
Log delivery APIs now report delivery status
Aug 4, 2021
Log delivery configurations in AWS now contain a log delivery status field named log_delivery_status. This field reports the latest status of log delivery attempts, including the last time that a log delivery was attempted. It helps you determine when delivery attempts succeed or fail, and it can help you troubleshoot delivery failures through failure mode categories and detailed failure messages. See Configure audit log delivery.
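A client could read the status field along the lines of the Python sketch below. Only the name log_delivery_status comes from the text above; the nested keys (status, message, last_attempt_time), the status values, and the sample response are assumptions for illustration, so check them against the log delivery API reference.

```python
def summarize_delivery_status(config: dict) -> str:
    """Produce a one-line summary from a log delivery configuration dict."""
    status = config.get("log_delivery_status") or {}
    state = status.get("status", "UNKNOWN")
    last_attempt = status.get("last_attempt_time", "never")
    if state == "SUCCEEDED":
        return f"OK (last attempt: {last_attempt})"
    # For anything else, surface the failure category and detailed message.
    message = status.get("message", "no message")
    return f"{state}: {message} (last attempt: {last_attempt})"


# Hypothetical response fragment, not real API output.
sample_config = {
    "config_name": "audit-logs-to-s3",
    "log_delivery_status": {
        "status": "USER_FAILURE",
        "message": "Access denied when writing to the bucket",
        "last_attempt_time": "2021-08-04T12:00:00Z",
    },
}
summary = summarize_delivery_status(sample_config)
```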
Use the AWS EBS SSD gp3 volume type for all clusters in a workspace
August 9-16, 2021: Version 3.52
You can now select either gp2 or gp3 as the AWS EBS SSD volume type for all clusters in a Databricks workspace. Databricks recommends you use gp3 for its cost savings compared to gp2.
For information, see Manage SSD storage.
Audit events are logged when you interact with Databricks Repos
August 9-13, 2021: Version 3.52
When audit logging is enabled, an audit event is now logged when you create, update, or delete a Databricks Repo, when you list all Databricks Repos associated with a workspace, and when you sync changes between a Databricks Repo and a remote repo. For more information, see Git folder events.
Improved job creation and management workflow
August 9-13, 2021: Version 3.52
You can now view and manage jobs associated with a notebook. Specifically, you can start a job run, view the current or most recent run, pause or resume the job’s schedule, and delete the job.
The notebook job creation UI has been revised and new configuration options added. For details, see Create and manage scheduled notebook jobs.
Simplified instructions for setting Git credentials (Public Preview)
August 9-13, 2021: Version 3.52
The instructions on the Git integration tab of the User Settings page have been simplified.
Import multiple notebooks in .html format
August 9-13, 2021: Version 3.52
You can now import multiple notebooks in .html format in a .zip file. Previously, you could only import a single notebook in .html format at a time.
The .zip file can contain folders and notebooks in either .html format or source file format (Python, Scala, SQL, or R). A .zip file cannot include both formats.
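Before uploading, you could check an archive against the one-format rule with a short script such as the Python sketch below. It is not part of Databricks; the function names are hypothetical, and the mapping of source formats to the extensions .py, .scala, .sql, and .r is an assumption.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Assumed file extensions for the source formats (Python, Scala, SQL, R).
SOURCE_SUFFIXES = {".py", ".scala", ".sql", ".r"}


def notebook_formats(archive) -> set:
    """Return the notebook formats ("html", "source") found in a .zip archive."""
    formats = set()
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):  # folder entry, not a notebook
                continue
            suffix = PurePosixPath(name).suffix.lower()
            if suffix == ".html":
                formats.add("html")
            elif suffix in SOURCE_SUFFIXES:
                formats.add("source")
    return formats


def has_single_format(archive) -> bool:
    """True when the archive holds notebooks in exactly one format."""
    return len(notebook_formats(archive)) == 1


def make_zip(names):
    """Build an in-memory .zip with empty files, for demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name in names:
            zf.writestr(name, "")
    buffer.seek(0)
    return buffer


html_only = make_zip(["etl/load.html", "etl/clean.html"])
mixed = make_zip(["etl/load.html", "etl/clean.py"])
```

Here has_single_format(html_only) is True and has_single_format(mixed) is False, which mirrors the restriction that a .zip file cannot include both formats.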
Usability improvements for Delta Live Tables
August 9-13, 2021: Version 3.52
This release includes the following enhancements to the Delta Live Tables runtime and UI:
When creating a pipeline, you can now specify a target database for publishing your Delta Live Tables tables and metadata. See Use Delta Live Tables pipelines with legacy Hive metastore for more information on publishing datasets.
Notebooks now support syntax highlighting for keywords in SQL dataset definitions. You can use this syntax highlighting to ensure the correctness of your Delta Live Tables SQL statements. See the SQL language reference for details on the Delta Live Tables SQL syntax.
The Delta Live Tables runtime now emits your pipeline graph before running the pipeline, allowing you to see the graph in the UI sooner.
All Python libraries configured in your notebooks are now installed before running any Python code, ensuring that libraries are globally accessible to any Python notebook in your pipeline. See Manage Python dependencies for Delta Live Tables pipelines.
Configure Databricks for SSO with Microsoft Entra ID in your Azure tenant
August 3, 2021
You can now configure Databricks for SSO with Microsoft Entra ID in your Azure tenant. For more information, see SSO with Microsoft Entra ID for your workspace.