What’s coming?

Learn about upcoming Databricks releases.

Behavior change for working with variant data type

Databricks is blocking support for using fields with the variant data type in comparisons performed as part of the following operators and clauses:

  • DISTINCT

  • INTERSECT

  • EXCEPT

  • UNION

  • DISTRIBUTE BY

The same holds for these DataFrame functions:

  • df.dropDuplicates()

  • df.repartition()

Databricks does not support these operators and functions for variant data type comparisons because they produce non-deterministic results.

These expressions will be blocked when using variant in Databricks Runtime 16.1 and above. Maintenance releases will block support in Databricks Runtime 15.3 and above.

If you use VARIANT type in your Databricks workloads or tables, take the following recommended actions:

  1. Find the queries that use variant with any of the listed operators.

  2. Update these queries using recommended patterns that explicitly cast variant values to non-variant types.
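As a starting point for step 1, the following is a minimal sketch of a keyword scan over saved SQL text. It assumes you can export your queries as strings. The names `flagged_operators` and `queries_to_review` are hypothetical helpers, not part of any Databricks API. The scan only spots the listed SQL keywords, so it cannot tell whether a variant column is actually involved, and it does not cover the DataFrame functions `df.dropDuplicates()` and `df.repartition()`. Treat each hit as a candidate for manual review.

```python
import re
from typing import Dict, List

# SQL operators and clauses named in this notice. The pattern matches
# keywords only; it has no knowledge of column types.
FLAGGED = re.compile(
    r"\b(DISTINCT|INTERSECT|EXCEPT|UNION|DISTRIBUTE\s+BY)\b",
    re.IGNORECASE,
)


def flagged_operators(sql: str) -> List[str]:
    """Return flagged keywords in a SQL string, upper-cased, in order of first appearance."""
    found: List[str] = []
    for match in FLAGGED.finditer(sql):
        # Collapse internal whitespace so "distribute   by" becomes "DISTRIBUTE BY".
        keyword = " ".join(match.group(1).upper().split())
        if keyword not in found:
            found.append(keyword)
    return found


def queries_to_review(queries: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each query name to its flagged keywords, skipping queries with no hits."""
    report: Dict[str, List[str]] = {}
    for name, sql in queries.items():
        hits = flagged_operators(sql)
        if hits:
            report[name] = hits
    return report
```

Calling `queries_to_review` with a dictionary of query names and SQL text returns only the queries that contain at least one of the listed keywords.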

The following table provides examples of existing unintended functionality and recommended workarounds:

| Unintended use | Recommended use |
| --- | --- |
| SELECT distinct(variant_expr) FROM ... | SELECT distinct(variant_expr?::string) FROM ... |
| SELECT variant_expr FROM ... EXCEPT SELECT variant_expr FROM ... | SELECT variant_expr?::string FROM ... EXCEPT SELECT variant_expr?::string FROM ... |

Note

For any fields you plan to use for comparison or distinct operations, Databricks recommends extracting these fields from the variant column and storing them using non-variant types.

See Query variant data. Contact your Databricks account representative if you require additional support or advisement.

Update to Databricks Marketplace and Partner Connect UI

We are simplifying the sidebar by merging Partner Connect and Marketplace into a single Marketplace link. The new Marketplace link will be higher on the sidebar.


IPYNB notebooks will become the default notebook format for Databricks in December 2024

Currently, Databricks creates all new notebooks in the “Databricks source format” by default. In December 2024, the new default notebook format will be IPYNB (.ipynb). Users who prefer the Databricks source format can change this default in the workspace user Settings pane.

Workspace files will be enabled for all Databricks workspaces on Feb 1, 2025

Databricks will enable workspace files for all Databricks workspaces on February 1, 2025. This change unblocks workspace users from using new workspace file features. After February 1, 2025, you won’t be able to disable workspace files using the enableWorkspaceFilesystem property with the Databricks PATCH workspace-conf/setstatus REST API. For more details on workspace files, see What are workspace files?.
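To see how a workspace is currently configured before the change, one option is to read the setting back through the same workspace-conf API. The sketch below only builds the request and does not send it. The endpoint path `/api/2.0/workspace-conf`, the `keys` query parameter, and bearer-token authentication are assumptions based on the commonly documented form of this API, and the host and token are placeholders. The helper name `build_workspace_conf_request` is hypothetical.

```python
import urllib.parse
import urllib.request


def build_workspace_conf_request(host: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the enableWorkspaceFilesystem setting."""
    query = urllib.parse.urlencode({"keys": "enableWorkspaceFilesystem"})
    # Assumed endpoint path; check the workspace-conf API reference for your cloud.
    url = f"{host.rstrip('/')}/api/2.0/workspace-conf?{query}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder values for illustration only.
request = build_workspace_conf_request(
    "https://example.cloud.databricks.com", "<personal-access-token>"
)
```

Passing the resulting object to `urllib.request.urlopen` would send it. That step is left out here because it needs a real workspace URL and token.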

Tables are shared with history by default in Delta Sharing

Databricks plans to change the default setting for tables shared using Delta Sharing to include history by default. Previously, history sharing was disabled by default. Sharing table history improves read performance and provides automatic support for advanced Delta optimizations.

Predictive optimization enabled by default on all new Databricks accounts

On November 11, Databricks will enable predictive optimization as the default for all new Databricks accounts. Previously, it was disabled by default and could be enabled by your account administrator. When predictive optimization is enabled, Databricks automatically runs maintenance operations for Unity Catalog managed tables. For more information on predictive optimization, see Predictive optimization for Unity Catalog managed tables.

Reduced cost and more control over performance vs. cost for your serverless compute for workflows workloads

In addition to the currently supported automatic performance optimizations, enhancements to the serverless compute for workflows optimization features will give you more control over whether workloads are optimized for performance or cost. To learn more, see Cost savings on serverless compute for Notebooks, Jobs, and Pipelines.

Changes to legacy dashboard version support

Databricks recommends using AI/BI dashboards (formerly Lakeview dashboards). Earlier versions of dashboards, previously referred to as Databricks SQL dashboards, are now called legacy dashboards. Databricks does not recommend creating new legacy dashboards. AI/BI dashboards offer improved features compared to the legacy version, including AI-assisted authoring, draft and published modes, and cross-filtering.

End of support timeline for legacy dashboards

  • April 7, 2025: Official support for the legacy version of dashboards will end. Only critical security issues and service outages will be addressed.

  • November 3, 2025: Databricks will begin archiving legacy dashboards that have not been accessed in the past six months. Archived dashboards will no longer be accessible, and the archival process will occur on a rolling basis. Access to actively used dashboards will remain unchanged.

Databricks will work with customers to develop migration plans for active legacy dashboards after November 3, 2025.

To help transition to AI/BI dashboards, upgrade tools are available in both the user interface and the API. For instructions on how to use the built-in migration tool in the UI, see Clone a legacy dashboard to an AI/BI dashboard. For tutorials about creating and managing dashboards using the REST API, see Use Databricks APIs to manage dashboards.

Changes to serverless compute workload attribution

Currently, your billable usage system table might include serverless SKU billing records with null values for run_as, job_id, job_run_id, and notebook_id. These records represent costs associated with shared resources that are not directly attributable to any particular workload.

To help simplify cost reporting, Databricks will soon attribute these shared costs to the specific workloads that incurred them. You will no longer see billing records with null values in workload identifier fields. As you increase your usage of serverless compute and add more workloads, the proportion of these shared costs on your bill will decrease as they are shared across more workloads.
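To gauge how many of these unattributed records you have today, the following minimal sketch filters exported billing rows for the case where every workload identifier is null. It assumes the rows have already been exported as flat Python dictionaries keyed by the field names used above. In the system table itself these fields may be nested inside other columns, so adapt the lookup to your schema. The helper name `unattributed` is hypothetical.

```python
from typing import Any, Dict, Iterable, List

# Workload identifier fields named in this notice.
WORKLOAD_FIELDS = ("run_as", "job_id", "job_run_id", "notebook_id")


def unattributed(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return records where every workload identifier is missing or None."""
    return [
        record
        for record in records
        if all(record.get(field) is None for field in WORKLOAD_FIELDS)
    ]
```

A record with even one populated identifier is treated as attributed and is left out of the result.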

For more information on monitoring serverless compute costs, see Monitor the cost of serverless compute.

Unity Catalog will soon drop support for storage credentials that use non-self-assuming IAM roles

Starting on September 20, 2024, Databricks will require that AWS IAM roles for new storage credentials be self-assuming. On January 20, 2025, Databricks will enforce this requirement on all existing storage credentials. Storage credentials that violate this requirement will cease to work, which might cause dependent workloads and jobs to fail. To learn more about this requirement and how to check and update your storage credentials, see Self-assuming role enforcement policy.
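As a rough pre-check, the sketch below tests whether a role's trust policy document names the role itself as a principal allowed to call `sts:AssumeRole`, which is the usual meaning of a self-assuming role. It is an illustration under that assumption and not the check Databricks performs. It ignores `Condition` blocks and wildcard principals, and the ARNs in any example input are placeholders. The helper name `is_self_assuming` is hypothetical. The linked enforcement policy remains the authoritative reference.

```python
from typing import Any, Dict, List


def _as_list(value: Any) -> List[Any]:
    """IAM policy JSON allows a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def is_self_assuming(role_arn: str, trust_policy: Dict[str, Any]) -> bool:
    """Return True if an Allow statement for sts:AssumeRole lists role_arn as an AWS principal."""
    for statement in _as_list(trust_policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        if "sts:AssumeRole" not in _as_list(statement.get("Action")):
            continue
        principal = statement.get("Principal")
        if not isinstance(principal, dict):
            # For example Principal "*"; not handled by this sketch.
            continue
        if role_arn in _as_list(principal.get("AWS")):
            return True
    return False
```

The function expects the trust policy as a parsed JSON document, for example the output of `json.loads` on the policy text.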

The sourceIPAddress field in audit logs will no longer include a port number

Due to a bug, certain authorization and authentication audit logs include a port number in addition to the IP in the sourceIPAddress field (for example, "sourceIPAddress":"10.2.91.100:0"). The port number, which is logged as 0, does not provide any real value and is inconsistent with the rest of the Databricks audit logs. To enhance the consistency of audit logs, Databricks plans to change the format of the IP address for these audit log events. This change will gradually roll out starting in early August 2024.

If the audit log contains a sourceIPAddress of 0.0.0.0, Databricks might stop logging it.
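If downstream tooling parses this field, it may help to accept both formats during the rollout. The following minimal sketch strips a trailing port from an IPv4 value such as the example above and leaves anything else unchanged. The helper name `strip_port` is hypothetical, and the pattern covers only the IPv4-with-port shape shown in this notice.

```python
import re

# An IPv4 address followed by ":<port>", as in "10.2.91.100:0".
_IPV4_WITH_PORT = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3}):\d+$")


def strip_port(source_ip: str) -> str:
    """Return the address without a trailing port; leave other values unchanged."""
    match = _IPV4_WITH_PORT.match(source_ip)
    return match.group(1) if match else source_ip
```

Values that already lack a port, including IPv6 addresses, pass through untouched, so the same code works before and after the format change.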

Legacy Git integration is EOL on January 31

After January 31, 2024, Databricks will remove legacy notebook Git integrations. This feature has been in legacy status for more than two years, and a deprecation notice has been displayed in the product UI since November 2023.

For details on migrating to Databricks Git folders (formerly Repos) from legacy Git integration, see Switching to Databricks Repos from Legacy Git integration. If this removal impacts you and you need an extension, contact your Databricks account team.

External support ticket submission will soon be deprecated

Databricks is transitioning the support ticket submission experience from help.databricks.com to the help menu in the Databricks workspace. Support ticket submission via help.databricks.com will soon be deprecated. You’ll continue to view and triage your tickets at help.databricks.com.

The in-product experience, which is available if your organization has a Databricks Support contract, integrates with Databricks Assistant to help address your issues quickly without having to submit a ticket.

To access the in-product experience, click your user icon in the top bar of the workspace, and then click Contact Support or type “I need help” into the assistant.

The Contact support modal opens.


If the in-product experience is down, send requests for support with detailed information about your issue to help@databricks.com. For more information, see Get help.

JDK 8 and JDK 11 will be unsupported

Databricks plans to remove JDK 8 support with the next major Databricks Runtime version, when Spark 4.0 releases. Databricks plans to remove JDK 11 support with the next LTS version of Databricks Runtime 14.x.

Automatic enablement of Unity Catalog for new workspaces

Databricks has begun to enable Unity Catalog automatically for new workspaces. This removes the need for account admins to configure Unity Catalog after a workspace is created. Rollout is proceeding gradually across accounts.

sqlite-jdbc upgrade

Databricks plans to upgrade the sqlite-jdbc version from 3.8.11.2 to 3.42.0.0 in all Databricks Runtime maintenance releases. The APIs of version 3.42.0.0 are not fully compatible with 3.8.11.2. Confirm that the methods and return types your code uses are compatible with version 3.42.0.0.

If you are using sqlite-jdbc in your code, check the sqlite-jdbc compatibility report.