July 2024

These features and Databricks platform improvements were released in July 2024.

Note

Releases are staged. Your Databricks account might not be updated until a week or more after the initial release date.

Increased limit for simultaneous tasks

July 31, 2024

The workspace limit for tasks running simultaneously has been raised to 2000. See Resource limits.

Embed and drag & drop images in notebooks

July 31, 2024

You can now display images in notebooks by embedding them directly in markdown cells. Drag and drop images from your desktop directly into markdown cells to automatically upload and display them. See Display images and Drag and drop images.

Command palette available in notebooks

July 31, 2024

You can now quickly perform actions in the notebook using the command palette. Press Cmd + Shift + P on macOS or Ctrl + Shift + P on Windows while in a notebook to access frequently used actions. See Command palette.

Workflow system schema renamed to lakeflow

July 31, 2024

The workflow system schema is being renamed to lakeflow. We recommend that you switch to lakeflow because it will include all the current tables plus new ones in the future, such as pipelines. Customers must opt in to the lakeflow schema to make it visible in their metastore. See Jobs system table reference.

LakeFlow Connect (gated Public Preview)

July 31, 2024

LakeFlow Connect offers native connectors that enable you to ingest data from databases and enterprise applications and load it into Databricks. LakeFlow Connect uses efficient incremental reads and writes to make data ingestion faster, more scalable, and more cost-efficient, while your data remains fresh for downstream consumption.

Salesforce Sales Cloud, Microsoft Azure SQL Database, Amazon RDS for SQL Server, and Workday are currently supported. See LakeFlow Connect.

Support for Cloudflare R2 storage is GA

July 30, 2024

The ability to use Cloudflare R2 as cloud storage for data registered in Unity Catalog is now generally available. Cloudflare R2 is intended primarily for Delta Sharing use cases in which you want to avoid the data egress fees charged by cloud providers when data crosses regions. R2 storage supports all of the Databricks data and AI assets supported in AWS S3, Azure Data Lake Storage Gen2, and Google Cloud Storage. Support for R2 requires a SQL Warehouse or Databricks Runtime 14.3 or above. See Use Cloudflare R2 replicas or migrate storage to R2 and Create a storage credential for connecting to Cloudflare R2.

Monitor Databricks Assistant activities with system tables (Public Preview)

July 30, 2024

You can now monitor Databricks Assistant activities in a dashboard by using system tables. For more information, see Databricks Assistant system table reference and example.

Sharing schemas using Delta Sharing is now GA

July 30, 2024

The ability to share schemas using Delta Sharing is GA. Sharing an entire schema gives the recipient access to all of the tables and views in the schema at the moment you share it, along with any tables and views that are added to the schema in the future. Adding schemas to a share using SQL commands requires a SQL warehouse or a cluster running Databricks Runtime 13.2 or above. Doing the same using Catalog Explorer has no compute requirements. See Add schemas to a share.

Mosaic AI Agent Framework is available in eu-central-1

July 29, 2024

Mosaic AI Agent Framework is now available in the eu-central-1 region. See Features with limited regional availability.

Databricks Assistant can diagnose issues with jobs (Public Preview)

July 29, 2024

Databricks Assistant can now diagnose issues with failed jobs. See Diagnose errors in jobs.

Updates to Databricks Git folders authentication and sharing behaviors

July 29, 2024

  • Git folder dialog-based authentication handling: The Git folder dialog now helps you recover from authentication errors that occur when you open it. You can update your Git credentials directly in the dialog, which triggers an automatic retry.

    • When an authentication error occurs, the Git folder dialog now shows the Git folder’s provider and URL in the error. Previously this was hidden, making it difficult to know which Git credential should be used to resolve the error.

  • Git folder sharing: Users can now share a URL link with other workspace users. When the URL is opened in the recipient’s browser, Databricks opens the existing Add Git folder dialog with pre-filled values (such as the Git provider and the Git repository URL). This simplifies Git folder cloning for commonly used Git repositories among your workspace users. See Best practice: Collaborating in Git folders for more details.

    • Users are now prompted to create their own Git folders in their own workspace rather than working collaboratively in another user’s Git folder.

    • The Git folder dialog state is now persisted in your URL. If you copy the URL from your browser when the Git folder dialog is open, it can be opened later or shared with another user and the same information will be displayed.

  • Git folder diff view: In the Git folder diff view, darker red and green highlighting was added for replaced text and for multiple lines of changes, making it easier to see what changed in your uncommitted changes.

    • Opening the Git folder dialog from a notebook or file editor selects that notebook or file in the Git folder dialog and displays the changes (diffs) by default.

Cluster library installation timeout

July 29, 2024

Library installation on clusters now has a timeout of 2 hours. A library that takes more than 2 hours to install is marked as failed. For information on cluster libraries, see Cluster libraries.

Compute plane outbound IP addresses must be added to a workspace IP allow list

July 29, 2024

When you configure IP access lists on a new workspace, you must either add all public IPs that the compute plane uses to access the control plane to an allowlist, or configure back-end PrivateLink. This change affects all new workspaces starting on July 29, 2024, and existing workspaces starting on August 26, 2024. For more information, see the Databricks Community post.

For example, when you configure a customer-managed VPC, subnets must have outbound access to the public network using a NAT gateway or a similar approach. Those public IPs must be included in an allowlist. See Subnets. Alternatively, if you use a Databricks-managed VPC and configure the managed NAT gateway to access public IPs, those IPs must be in an allowlist.

See Configure IP access lists for workspaces.
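One way to sanity-check an allowlist before this change takes effect is to compare your compute plane egress IPs against the allowlist entries. The following is a minimal sketch that uses only the Python standard library. The function name and all IP values are hypothetical (the addresses come from the RFC 5737 documentation ranges), and the sketch does not call any Databricks API.

```python
import ipaddress


def uncovered_egress_ips(egress_ips, allowlist):
    """Return the egress IPs that are not covered by any allowlist entry.

    Allowlist entries can be single addresses or CIDR ranges.
    """
    networks = [ipaddress.ip_network(entry, strict=False) for entry in allowlist]
    missing = []
    for ip in egress_ips:
        address = ipaddress.ip_address(ip)
        if not any(address in network for network in networks):
            missing.append(ip)
    return missing


# Hypothetical NAT gateway public IPs and allowlist entries.
nat_gateway_ips = ["203.0.113.10", "198.51.100.7"]
ip_allowlist = ["203.0.113.0/24", "192.0.2.15"]

missing = uncovered_egress_ips(nat_gateway_ips, ip_allowlist)
print(missing)  # ['198.51.100.7']
```

Any address that the check reports would need to be added to the allowlist, unless you configure back-end PrivateLink instead.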

Databricks Runtime 9.1 series support extended

July 26, 2024

Support for Databricks Runtime 9.1 LTS and Databricks Runtime 9.1 LTS for Machine Learning has been extended from September 23, 2024 to December 19, 2024.

Single sign-on (SSO) is supported in Lakehouse Federation for SQL Server

July 25, 2024

Unity Catalog now allows you to create SQL Server connections using SSO authentication. See Run federated queries on Microsoft SQL Server.

Enable cross-Geo processing

July 26, 2024

Account admins can now enable cross-Geo processing to allow data processing in data centers outside of a workspace Geo for Designated Services. If a Designated Service is not available in your workspace Geo, an account admin may be able to use the feature by explicitly giving permission to process relevant data in another Geo. See Enable cross-geo processing.

Model sharing using Delta Sharing is now generally available

July 26, 2024

Delta Sharing support for AI model sharing is now GA. Both the provider and recipient workspaces must be enabled for Unity Catalog, and models must be registered in Unity Catalog.

See Add models to a share.

Share comments and primary key constraints using Delta Sharing

July 25, 2024

Delta Sharing now supports the sharing of object metadata, including comments and primary key constraints:

  • Model comments and model version comments have been included in Databricks-to-Databricks shares for some time, but not announced.

  • Table comments, column comments, primary key constraints, and volume comments are now included in Databricks-to-Databricks shares that were shared with the recipient on or after July 25, 2024.

    If you want to include comments or constraints in a share that was shared with a recipient before that date, you must revoke and re-grant recipient access to trigger comment and constraint sharing.

See Create and manage shares for Delta Sharing.

New Databricks JDBC Driver (OSS)

July 25, 2024

A new open-source Databricks JDBC driver is now available in Public Preview. The driver implements the JDBC APIs and provides other core functionality, including OAuth, Cloud Fetch, and features such as Unity Catalog volume ingestion. For more information, see Databricks JDBC Driver (OSS).

Databricks Runtime 15.4 LTS (Beta)

July 23, 2024

Databricks Runtime 15.4 LTS and Databricks Runtime 15.4 LTS ML are now available as Beta releases.

See Databricks Runtime 15.4 LTS and Databricks Runtime 15.4 LTS for Machine Learning.

Scala is GA on Unity Catalog shared compute

July 23, 2024

In Databricks Runtime 15.4 LTS and above, Scala is generally available on shared access mode Unity Catalog-enabled compute, including support for scalar user-defined functions (UDFs). Structured Streaming, Hive UDFs, and Hive user-defined aggregate functions are not supported. For a complete list of limitations, see Compute access mode limitations for Unity Catalog.

Single user compute supports fine-grained access control, materialized views, and streaming tables

July 23, 2024

Databricks Runtime 15.4 LTS introduces support for fine-grained access control on single user compute, as long as the workspace is enabled for serverless compute. When a query accesses any of the following, the single user compute resource on Databricks Runtime 15.4 LTS passes the query to serverless compute to run data filtering:

  • Views built over tables on which the user does not have the SELECT privilege

  • Dynamic views

  • Tables with row filters or column masks applied

  • Materialized views and streaming tables

These queries are unsupported on single user compute running on Databricks Runtime 15.3 and below.

For more information, see Fine-grained access control on single user compute.

Node timeline system table is now available (Public Preview)

July 23, 2024

The system.compute schema now includes a node_timeline table. This table logs minute-by-minute utilization metrics for the all-purpose and jobs compute resources run in your account. See Node timeline table schema.

Note

To access this table, an admin must enable the compute schema if it is not already enabled. See Enable system table schemas.

Meta Llama 3.1 is now supported in Model Serving

July 23, 2024

Mosaic AI Model Serving has partnered with Meta to support Meta Llama 3.1, a model architecture built and trained by Meta. Llama 3.1 is supported as part of Foundation Model APIs. See Use Foundation Model APIs.

  • Meta-Llama-3.1-405B-Instruct and Meta-Llama-3.1-70B-Instruct are available in pay-per-token serving endpoint regions.

  • Production usage of the full suite of Llama 3.1 models (8B, 70B, and 405B) is available in the US using provisioned throughput.

Starting July 23, 2024, Meta-Llama-3.1-70B-Instruct replaces support for Meta-Llama-3-70B-Instruct in Foundation Model APIs pay-per-token endpoints.

Unity Catalog will soon drop support for storage credentials that use non-self-assuming IAM roles

July 22, 2024

Starting on September 20, 2024, Databricks will require that AWS IAM roles for new storage credentials be self-assuming. On January 20, 2025, Databricks will enforce this requirement on all existing storage credentials. Storage credentials that violate this requirement will cease to work, which might cause dependent workloads and jobs to fail. To learn more about this requirement and how to check and update your storage credentials, see Self-assuming role enforcement policy.
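As a rough illustration of what "self-assuming" means, the sketch below inspects a role's trust policy document and checks whether the role's own ARN is allowed to call sts:AssumeRole. This is a simplified local check under stated assumptions: the function name, account IDs, and role names are hypothetical, and the check ignores conditions and wildcard principals. Treat the Self-assuming role enforcement policy article as the authoritative procedure.

```python
def is_self_assuming(role_arn, trust_policy):
    """Return True if an Allow statement lets role_arn assume itself.

    Simplified: looks only at Effect, Action, and Principal.AWS.
    """
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue
        principals = principal.get("AWS", [])
        if isinstance(principals, str):
            principals = [principals]
        if role_arn in principals:
            return True
    return False


# Hypothetical ARNs, used only for illustration.
role_arn = "arn:aws:iam::111122223333:role/example-storage-role"
other_arn = "arn:aws:iam::444455556666:role/example-other-role"

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": [other_arn, role_arn]},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(is_self_assuming(role_arn, trust_policy))  # True
```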

Notebooks: toggle more visible cell titles

July 18, 2024

Users can enable Show promoted cell titles in their developer settings to make notebook cell titles more visible in the UI. See Promoted cell titles.

/ in workspace asset names is deprecated

July 17, 2024

To avoid ambiguity in path strings, the use of ‘/’ in the names of new workspace assets (such as notebooks, folders, and queries) has been deprecated. Existing assets with ‘/’ in their names are not affected, but renaming of existing assets follows the same rules as new assets.
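The ambiguity is easy to see when an asset name that contains ‘/’ is joined to its parent folder: the resulting path string can no longer be split back into the original folder and name. The following sketch uses hypothetical folder and asset names and plain string handling, not a Databricks API.

```python
import posixpath


def has_path_separator(name):
    """Return True if an asset name contains the '/' path separator."""
    return "/" in name


def path_components(path):
    """Split a workspace-style path into its non-empty components."""
    return [part for part in path.split("/") if part]


folder = "/Shared/reports"  # hypothetical parent folder
name = "2024/07 summary"    # hypothetical asset name containing '/'

full_path = posixpath.join(folder, name)
print(full_path)                   # /Shared/reports/2024/07 summary
print(path_components(full_path))  # ['Shared', 'reports', '2024', '07 summary']
print(has_path_separator(name))    # True
```

The single asset name appears as two path components, which is indistinguishable from an asset named "07 summary" inside a folder named "2024".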

Delta Sharing lets you share tables that use liquid clustering

July 16, 2024

Delta Sharing now lets you share tables that are enabled for liquid clustering, and recipients can run batch queries against them. Liquid clustering simplifies data layout decisions and optimizes query performance. See Use liquid clustering for Delta tables and Delta Lake feature support matrix.

Query history system table is now available (Public Preview)

July 16, 2024

Databricks system tables now include a query history table. This table logs detailed records of each query run on a SQL warehouse in your account. To access the table, admins must enable the new query system schema. See Query history system table reference.

Vulnerability scan reports are now emailed to admins

July 16, 2024

Vulnerability scan reports are now emailed to workspace admins in workspaces that enable enhanced security monitoring. Previously, workspace admins had to request them from Databricks. See Enhanced security monitoring.

Partition metadata logging for Unity Catalog external tables

July 15, 2024

In Databricks Runtime 13.3 LTS and above, you can optionally enable partition metadata logging for external tables registered to Unity Catalog that use Parquet, ORC, CSV, or JSON. Partition metadata logging is a partition discovery strategy consistent with Hive metastore. See Partition discovery for external tables.

Serverless compute for workflows is GA

July 15, 2024

Serverless compute for workflows is now generally available. Serverless compute for workflows allows you to run your Databricks job without configuring and deploying infrastructure. With serverless compute for workflows, Databricks efficiently manages the compute resources that run your job, including optimizing and scaling compute for your workloads. See Run your Databricks job with serverless compute for workflows.

Serverless compute for notebooks is GA

July 15, 2024

Serverless compute for notebooks is now generally available. Serverless compute for notebooks gives you on-demand access to scalable compute in notebooks, letting you immediately write and run your Python or SQL code. See Serverless compute for notebooks.

Databricks Connect for Python now supports serverless compute

July 15, 2024

Databricks Connect for Python now supports connecting to serverless compute. This feature is available in Public Preview. See Configure a connection to serverless compute.

Filter data outputs using natural language prompts

July 11, 2024

You can now use the Databricks Assistant to filter data outputs using natural language prompts. For instance, to filter the Titanic survivors data table, you could type “Show me only males over 70.” See Filter data with natural language prompts.

Plaintext secrets support for external models

July 11, 2024

You can now directly input API keys as plaintext strings to model serving endpoints that host external models. See Configure the provider for an endpoint.

Forecast time series data using ai_forecast()

July 11, 2024

AI Functions now supports ai_forecast(), a new Databricks SQL function for analysts and data scientists designed to extrapolate time series data into the future. See ai_forecast function.

SQL File task support for files with multi-statement SQL queries is GA

July 10, 2024

Support for using files that contain multi-statement SQL queries with the SQL File task is now generally available. This change allows you to run multiple SQL statements from a single file. Previously, you needed to add a separate file for each statement. To learn more about the SQL File task, see SQL task for jobs.
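To show what a multi-statement file looks like when it runs as a single unit, here is a small local sketch. It uses Python's built-in sqlite3 module as a stand-in for a SQL warehouse, so the table names, the file contents, and the SQL dialect are assumptions for illustration only. A real SQL File task is configured in Databricks Jobs and runs against a SQL warehouse.

```python
import sqlite3

# Contents of a hypothetical multi-statement SQL file.
# Statements are separated by semicolons and run in order.
sql_file_contents = """
CREATE TABLE orders (id INTEGER, amount REAL);
INSERT INTO orders VALUES (1, 19.5), (2, 5.25);
CREATE TABLE order_totals AS SELECT SUM(amount) AS total FROM orders;
"""

connection = sqlite3.connect(":memory:")

# executescript runs every statement in the string, one after another.
connection.executescript(sql_file_contents)

total = connection.execute("SELECT total FROM order_totals").fetchone()[0]
print(total)  # 24.75
connection.close()
```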

Lakehouse Federation supports Salesforce Data Cloud (Public Preview)

July 10, 2024

You can now run federated queries on data managed by Salesforce Data Cloud. See Run federated queries on Salesforce Data Cloud.

Databricks Assistant system table now available (Public Preview)

July 10, 2024

Databricks Assistant events are now logged in a system table located at system.access.assistant_events. See Databricks Assistant system table reference and example.

Account SCIM API v2.1 (Public Preview)

July 10, 2024

The Account SCIM APIs are updated from v2.0 to v2.1 for speed and reliability. You can download a PDF of the Account SCIM v2.1 API reference.

Foundation Model Fine-tuning available to all us-west-2 customers (Public Preview)

July 10, 2024

Foundation Model Fine-tuning, now part of Mosaic AI Model Training, is now available to all customers in the us-west-2 region. Customers no longer need to request access to use this feature in this region.

With Foundation Model Fine-tuning, you use your own data to customize a foundation model to optimize its performance for your specific application. By fine-tuning or continuing training of a foundation model, you can train your own model using significantly less data, time, and compute resources than training a model from scratch. See Foundation Model Fine-tuning.

UK Cyber Essentials Plus compliance controls

July 10, 2024

UK Cyber Essentials Plus (UKCE+) controls provide enhancements that help you with Cyber Essentials compliance for your workspace. UKCE+ is a certification created by the UK government to simplify and standardize IT security practices for commercial organizations that interact with UK government data. See UK Cyber Essentials Plus compliance controls.

End of life for Databricks-managed passwords

July 10, 2024

Starting on July 10, 2024, you can no longer use Databricks-managed passwords (known as basic authentication) to authenticate to the Databricks UI or APIs. If you do not have single sign-on configured, users now receive a unique code via email to log in. For automation, Databricks recommends using OAuth authentication. You can also authenticate with personal access tokens.

See End of life for Databricks-managed passwords.

Sign-in with one-time passcodes and external accounts

July 10, 2024

You can now allow users to sign in to Databricks using one-time passcodes or common external accounts, such as Google or Microsoft. See Sign-in with email or external accounts.

Resource quota increase for tables per Unity Catalog metastore

July 3, 2024

Your Unity Catalog metastore can now register up to one million tables. See Resource quotas.

Databricks Assistant can diagnose notebook errors automatically

July 2, 2024

Databricks Assistant can now run /fix in notebooks automatically when it detects an error message. Assistant uses generative AI to analyze your code and the error message to suggest a fix directly in your notebook. For more information, see Debug code: Python and SQL examples.

Support for the :param syntax with the SQL file task is GA

July 1, 2024

Support for using the :param syntax with parameterized queries in the Databricks Jobs SQL File task is generally available. You can now reference query parameters by prefixing their names with a colon (:parameter_name). This syntax is in addition to the existing support for the double curly braces ({{parameter_name}}) syntax. To learn more about using parameterized queries with the SQL File task, see Configure task parameters.
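The colon-prefixed style can be tried locally, because Python's built-in sqlite3 module accepts the same kind of named parameter marker. The sketch below is an illustration only: the table, column, and parameter names are hypothetical, and sqlite3 stands in for a SQL warehouse. In a SQL File task, the values come from the task's parameters rather than from a Python dictionary.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE trips (city TEXT, fare REAL)")
connection.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("Oslo", 12.0), ("Lima", 8.5), ("Oslo", 20.0)],
)

# Parameters are referenced by name with a leading colon.
query = "SELECT COUNT(*) FROM trips WHERE city = :city AND fare >= :min_fare"
params = {"city": "Oslo", "min_fare": 15.0}

count = connection.execute(query, params).fetchone()[0]
print(count)  # 1
connection.close()
```

Binding values by name keeps the query text unchanged between runs, with only the parameter values varying.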

OAuth in Databricks on AWS GovCloud

July 1, 2024

OAuth authentication is now available in Databricks on AWS GovCloud. See Authenticate access to Databricks with a service principal using OAuth (OAuth M2M).