Databricks SQL Release notes 2021
The following outlines the improvements and updates in Databricks SQL from January through December 2021.
December 15, 2021
Databricks SQL is Generally Available. This marks a major milestone in providing you with the first lakehouse platform that unifies data, AI, and BI workloads in one place. With GA, you can expect the highest level of stability, support, and enterprise-readiness from Databricks for mission-critical workloads. Read the GA announcement blog to learn more.
Alerts are now scheduled independently of queries. When you create a new alert, you are prompted to also create a schedule for it. If you had an existing alert, its schedule has been duplicated from the original query. This change also allows you to set alerts for both Run as Owner and Run as Viewer queries. Run as Owner queries run on the designated alert schedule with the query owner’s credential. Run as Viewer queries run on the designated alert schedule with the alert creator’s credential. See What are Databricks SQL alerts? and Schedule a query.
You can now re-order parameters in both the SQL editor and in dashboards.
The documentation for creating heatmap visualizations has been expanded. See Heatmap options.
December 9, 2021
When you create a table visualization, you can now set the font color for a column to a static value or to a range of values based on the column’s values. The literal value is compared to the threshold. For example, to colorize results whose values exceed 500000, create the threshold > 500000, rather than > 500,000. See Conditionally format column colors.
Icons in the tabbed SQL editor schema browser now allow you to distinguish between tables and views.
December 1, 2021
You can now apply SQL configuration parameters at the workspace level. Those parameters automatically apply to all existing and new SQL endpoints in the workspace. See Configure SQL parameters.
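Workspace-level parameters use the same keys you can set per session with SET. A minimal sketch of the session-level equivalent (the parameter chosen here is illustrative):

```sql
-- Set a SQL configuration parameter for the current session.
-- A workspace-level setting applies the same key to all endpoints.
SET ANSI_MODE = true;

-- Inspect the current value of the parameter.
SET ANSI_MODE;
```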
November 18, 2021
You can now open the SQL editor by using a sidebar shortcut. To open the SQL editor, click SQL Editor.
If you have permission to create Data Science & Engineering clusters, you can now create SQL endpoints by clicking Create in the sidebar and clicking SQL Endpoint.
Administrators can now transfer ownership of a query, dashboard, or alert to a different user via the UI.
November 4, 2021
In a Map (Choropleth) visualization, the maximum number of gradient steps for colors in the legend has been increased from 11 to 20. The default is 5 gradient steps, inclusive of Min color and Max color.
The tabbed SQL editor now supports bulk tab management. If you right-click on a tab, you’ll see the option to Close others, Close left, Close right, and Close all. Note that if you right-click on the first or last tab, you won’t see the options to Close left or Close right.
October 28, 2021
When you view a table in Catalog Explorer, you have two options to simplify interacting with the table:
Click Create > Query to create a query that selects all columns and returns the first 1000 rows.
Click Create > Quick Dashboard to open a configuration page where you can select columns of interest and create a dashboard and supporting queries that provide some basic information using those columns and showcase dashboard-level parameters and other capabilities.
October 19, 2021
New keyboard shortcuts are now available in the tabbed editor:
Open new tab: Ctrl+Alt+T (Windows), Cmd+Option+T (Mac)
Close current tab: Ctrl+Alt+W (Windows), Cmd+Option+W (Mac)
Open query dialog: Ctrl+Alt+O (Windows), Cmd+Option+O (Mac)
September 23, 2021
You can now create a new dashboard by cloning an existing dashboard, as long as you have the CAN RUN, CAN EDIT, or CAN MANAGE permission on the dashboard and all upstream queries. See Clone a legacy dashboard.
You can now use GROUP BY in a visualization with multiple Y-axis columns. See Scatter chart.
You can now use {{ @@yPercent }} to format data labels in an unnormalized stacked bar chart. See Bar chart.
If you use SAML authentication and your SAML credential will expire within a few minutes, you are now proactively prompted to log in again before executing a query or refreshing a dashboard. This helps to prevent disruption due to a credential that expires during query execution.
September 20, 2021
You can now transfer ownership of dashboards, queries, and alerts using the Permissions REST API. See Query ACLs.
September 16, 2021
In query results, BIGINT results are now serialized as strings when greater than 9007199254740991. This fixes a problem where BIGINT results could be truncated in query results. Other integer results are still serialized as numbers. Number formatting on axis labels and tooltips does not apply to BIGINT results that are serialized as strings. For more information about data types in Databricks SQL, see BIGINT type.
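The 9007199254740991 threshold is 2^53 − 1, the largest integer that a double-precision float (and hence a browser-side JavaScript number) can represent exactly. A sketch of the rounding that string serialization avoids:

```sql
-- 2^53 + 1 has no exact DOUBLE representation, so it silently rounds
-- down; serializing large BIGINT values as strings preserves the digits.
SELECT 9007199254740993          AS exact_bigint,
       CAST(9007199254740993 AS DOUBLE) AS rounded_double;  -- 9.007199254740992E15
```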
September 7, 2021
Databricks is rolling out the changes that follow over the course of a week. Your workspace may not be enabled for these changes until after September 7.
Databricks SQL is now in Public Preview and enabled for all users in new workspaces.
Note
If your workspace was enabled for Databricks SQL during the Public Preview—that is, before the week beginning September 7, 2021—users retain the entitlement assigned before that date, unless you change it. In other words, if a user did not have access to Databricks SQL during the Public Preview, they will not have it now unless an administrator gives it to them.
Administrators can manage which users have access to Databricks SQL by assigning the Databricks SQL access entitlement (databricks-sql-access in the API) to users or groups. By default, new users have this entitlement.
Administrators can limit a user or group to accessing only Databricks SQL and prevent them from accessing Data Science & Engineering or Databricks Mosaic AI by removing the Workspace access entitlement (workspace-access in the API) from the user or group. By default, new users have this entitlement.
Important
To log in and access Databricks, a user must have either the Databricks SQL access or Workspace access entitlement (or both).
A small classic SQL endpoint called Starter Endpoint is pre-configured on all workspaces, so you can get started creating dashboards, visualizations, and queries right away. To handle more complex workloads, you can easily increase its size (to reduce latency) or the number of underlying clusters (to handle more concurrent users). To manage costs, the starter endpoint is configured to terminate after 120 minutes idle.
If serverless compute is enabled for your workspace and you enable Serverless SQL endpoints, a Serverless SQL endpoint called Serverless Starter Endpoint is automatically created, and you can use it for dashboards, visualizations, and queries. Serverless SQL endpoints start more quickly than classic SQL endpoints and automatically terminate after 10 minutes idle.
To help you get up and running quickly, a new guided onboarding experience is available for administrators and users. The onboarding panel is visible by default, and you can always see how many onboarding tasks are left in the sidebar. Click tasks left to reopen the onboarding panel.
You can get started using Databricks SQL quickly with two rich datasets in a read-only catalog called SAMPLES, which is available from all workspaces. As you learn about Databricks SQL, you can use these schemas to create queries, visualizations, and dashboards. No configuration is required, and all users have access to these schemas.
The nyctaxi schema contains taxi trip data in the trips table.
The tpch schema contains retail revenue and supply chain data in the following tables:
customer
lineitem
nation
orders
part
partsupp
region
supplier
Click Run your first query in the onboarding panel to generate a new query of the nyctaxi schema.
To learn about visualizing data in Databricks SQL with no configuration required, you can import dashboards from the Dashboard Samples Gallery. These dashboards are powered by the datasets in the SAMPLES catalog. To view the Dashboard Samples Gallery, click Import sample dashboard in the onboarding panel.
You can now create and drop native SQL functions using the CREATE FUNCTION and DROP FUNCTION commands.
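A minimal sketch of a scalar SQL function (the function name and logic are hypothetical):

```sql
-- Create a simple scalar SQL UDF.
CREATE FUNCTION to_fahrenheit(celsius DOUBLE)
  RETURNS DOUBLE
  COMMENT 'Converts Celsius to Fahrenheit'
  RETURN celsius * 9 / 5 + 32;

SELECT to_fahrenheit(100);  -- 212.0

-- Remove it when no longer needed.
DROP FUNCTION to_fahrenheit;
```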
September 2, 2021
Users with the CAN EDIT permission on a dashboard can now manage the dashboard’s refresh schedule and subscription list. Previously, the CAN MANAGE permission was required. For more information, see Automatically refresh a dashboard.
You can now temporarily pause scheduled export to dashboard subscribers without modifying the schedule. Previously, you had to remove all subscribers, disable the schedule, and then recreate. For more information, see Temporarily pause scheduled dashboard updates.
By default, visualizations no longer dynamically resize based on the number of results returned, but maintain the same height regardless of the number of results. To return to the previous behavior and configure a visualization to dynamically resize, enable Dynamically resize panel height in the visualization’s settings in the dashboard. For more information, see Table options.
If you have access to more than one workspace in the same account, you can switch workspaces from within Databricks SQL. Click in the lower left corner of your Databricks workspace, then select a workspace to switch to it.
August 30, 2021
Serverless SQL endpoints provide instant compute, minimal management, and cost optimization for SQL queries.
Until now, computation for SQL endpoints happened in the compute plane in your AWS account. The initial release of serverless compute adds Serverless SQL endpoints to Databricks SQL, moving those compute resources to your Databricks account.
You use serverless SQL warehouses with Databricks SQL queries just like you use the SQL endpoints that live in your own AWS account, now called Classic SQL endpoints. But serverless SQL warehouses typically start with low latency compared to Classic SQL endpoints, are easier to manage, and are optimized for cost.
Before you can create serverless SQL warehouses, an admin must enable the Serverless SQL endpoints option for your workspace. Once enabled, new SQL endpoints are Serverless by default, but you can continue to create SQL endpoints as Serverless or Classic as you like.
For details about the Serverless compute architecture and comparisons with the classic compute plane, see Serverless compute plane. For details about configuring serverless SQL warehouses—including how to convert Classic SQL endpoints to Serverless—see Enable serverless SQL warehouses.
For the list of supported regions for serverless SQL warehouses, see Databricks clouds and regions.
Important
Serverless compute is subject to applicable terms that must be accepted by an account owner or account admin in order to enable the feature.
August 12, 2021
You can now send a scheduled dashboard update to email addresses that are not associated with Databricks accounts. When viewing a dashboard, click Scheduled to view or update the list of subscribed email addresses. If an email address is not associated with a Databricks account, it must be configured as a notification destination. For more information, see Automatically refresh a dashboard.
An administrator can now terminate another user’s query while it is executing. For more information, see Terminate an executing query.
August 05, 2021
To reduce latency on SQL endpoints when your workspace uses AWS Glue Data Catalog as the external metastore, you can now configure client-side caching. For more information, see Higher latency with Glue Catalog than Databricks Hive metastore and Configure data access properties for SQL warehouses.
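As a reference point, Glue client-side caching is controlled through the SQL endpoint data access configuration. The keys and values below are illustrative; confirm the exact property names and limits against the linked documentation:

```
spark.hadoop.aws.glue.cache.table.enable true
spark.hadoop.aws.glue.cache.table.size 1000
spark.hadoop.aws.glue.cache.table.ttl-mins 30
spark.hadoop.aws.glue.cache.db.enable true
spark.hadoop.aws.glue.cache.db.size 100
spark.hadoop.aws.glue.cache.db.ttl-mins 30
```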
Improved EXPLAIN result formatting:
EXPLAIN results are easier to read.
Results are formatted as monospaced text with no line wrap.
July 29, 2021
Juggling multiple queries just got easier with support for multiple tabs in the query editor. To use the tabbed editor, see Edit multiple queries.
July 08, 2021
Visualization widgets in dashboards now have titles and descriptions so that you can tailor the title and description of visualizations used in multiple dashboards to the dashboard itself.
The sidebar has been updated for improved visibility and navigation:
Warehouses are now called SQL Endpoints, and History is renamed to Query History.
Account settings (formerly named Users) have been moved to Account. When you select Account you can change the Databricks workspace and log out.
User settings have been moved to Settings and have been split into Settings and SQL Admin Console. SQL Admin Console is visible only to admins.
The help icon changed to Help.
July 01, 2021
The new Catalog Explorer allows you to easily explore and manage permissions on databases and tables. Users can view schema details, preview sample data, and see table details and properties. Administrators can view and change data object owners, and data object owners can grant and revoke permissions. For details, see What is Catalog Explorer?.
Y-axes in horizontal charts have been updated to reflect the same ordering as in tables. If you have previously selected reverse ordering, you can use the Reverse Order toggle on the Y-axis tab to reverse the new ordering.
June 17, 2021
Photon, Databricks’ new vectorized execution engine, is now on by default for newly created SQL endpoints (both UI and REST API). Photon transparently speeds up:
Writes to Parquet and Delta tables.
Many SQL queries. See Limitations.
Easily manage users and groups with the CREATE GROUP, DROP GROUP, ALTER GROUP, SHOW GROUPS, and SHOW USERS commands. For details, see Security statements and Show statements.
The query editor schema browser is snappier and faster on schemas with more than 100 tables. On such schemas, the schema browser does not load all columns automatically; the list of tables still shows as usual, but columns load only when you click a table. This change affects query autocomplete in the query editor, because autocomplete depends on this information to show suggestions. Until you expand a table and load its columns, those suggestions are not available.
June 03, 2021
Admins of newly enabled Databricks workspaces now receive the Databricks SQL entitlement by default and are no longer required to give themselves the Databricks SQL access entitlement using the admin console.
Photon is now in public preview and enabled by default for new SQL endpoints.
Multi-cluster load balancing is now in public preview.
You can now enable collaboration on dashboards and queries with other members of your organization using CAN EDIT permission. See Access control lists.
May 26, 2021
SQL Analytics is renamed to Databricks SQL. This change has the following customer-facing impacts:
References in the web UI have been updated.
The entitlement to grant access to Databricks SQL has been renamed:
UI: Databricks SQL access (previously SQL Analytics access)
SCIM API: databricks-sql-access (previously sql-analytics-access)
Users, groups, and service principals with the previous entitlement have been migrated to the new entitlement.
Tags for audit log events related to Databricks SQL have changed:
The prefix for Databricks SQL events is now databrickssql.
changeSqlAnalyticsAcl is now changeDatabricksSqlAcl.
Dashboard updates
The dashboard export filename is now the dashboard name plus a timestamp, rather than a UUID.
The export records limit has been raised from 22,000 to 64,000.
Dashboard authors now have the ability to periodically export and email dashboard snapshots. Dashboard snapshots are taken from the default dashboard state, meaning that any interaction with the visualizations will not be present in the snapshot.
If you are the owner of a dashboard, you can create a refresh schedule and subscribe other users, who’ll get email snapshots of the dashboard every time it’s refreshed.
If you have view permission for a dashboard, you can subscribe to existing refresh schedules.
Predicate pushdown expressions (StartsWith, EndsWith, Contains, Not(EqualTo()), and DataType) are disabled for AWS Glue Catalog because they are not supported.
May 13, 2021
Databricks SQL no longer tries to guess column types. Previously, a column with the format xxxx-yy-dd was automatically treated as a date, even if it was an identification code. Now such a column is no longer automatically treated as a date; you must specify the conversion in the query if desired. This change may cause some visualizations that relied on the previous behavior to stop working. In this release, you can use the Settings > Backwards Compatibility option to return to the previous behavior. In a future release, this capability will be removed.
The query editor now has a query progress indicator. State changes are now visible in a continually updated progress bar.
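With automatic type guessing removed, an explicit cast restores date semantics where you want them. A sketch with a hypothetical table and column:

```sql
-- order_code looks like yyyy-mm-dd but is stored as a string;
-- cast explicitly when you want it treated as a date.
SELECT CAST(order_code AS DATE) AS order_date
FROM sales;
```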
Fixed issues
SQL editor. The SQL editor will now persist selected text and scroll position when switching between query tabs.
SQL editor. If you click ‘Run’ on a query in the SQL editor, then navigate to another page and return while the query is still executing, the editor will display the correct query state. If the query completes while you are on another page, query results will be available on return to the SQL editor page.
You can now use MySQL 8.0 as an external metastore.
DESCRIBE DETAIL commands on Delta tables no longer fail with java.lang.ClassCastException: java.sql.Timestamp cannot be cast to java.time.Instant.
Reading Parquet files with INT96 timestamps no longer fails.
When a user with CAN RUN permission runs a query created by another user, the query history now displays that user as the runner of the query.
Null values are now ignored when rendering a chart, improving the usability of charts. For example, previously, bars in a bar chart would look very small when null values were present. Now the axes are set based on non-null values only.