Databricks SQL release notes

This article lists new Databricks SQL features and improvements, along with known issues and FAQs.

Release process

Databricks releases updates to the Databricks SQL web application user interface on an ongoing basis, with all users getting the same updates, rolled out over a short period of time.

In addition, Databricks regularly releases new SQL endpoint compute versions. Two channels are always available: Preview and Current.

Note

Releases are staged. Your Databricks account may not be updated with a new SQL endpoint version or Databricks SQL feature until a week or more after the initial release date.

Channels

Channels let you choose whether to use the Current SQL endpoint compute version or the Preview version. Preview versions let you try out functionality before it becomes the Databricks SQL standard. Take advantage of preview versions to test your production queries and dashboards against upcoming changes.

Typically, a preview version is promoted to the current channel approximately two weeks after being released to the preview channel. Some features, such as security features, maintenance updates, and bug fixes, may be released directly to the current channel. From time to time, Databricks may promote a preview version to the current channel on a different schedule. Each new version will be announced in the following sections.

To learn how to switch an existing SQL endpoint to the preview channel, see _.

Current

Version 2022.17: May 4, 2022

  • The following Spark SQL functions are now available with this release:

    • `try_multiply`: Returns multiplier multiplied by multiplicand, or NULL on overflow.

    • `try_subtract`: Returns the subtraction of expr2 from expr1, or NULL on overflow.

  • SQL UDFs now support DEFAULT values for their parameters (see the sketch after this list).

  • You can now create Delta tables by ingesting small CSV files (up to 100 MB) with the Create Table UI. This new UI supports CSV file upload and data preview with an inferred schema. It lets you edit column names, data types, and commonly used format options before creating the table, and you can specify a destination path (catalog and schema) for your new table.
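
A rough sketch of these additions (the UDF name and logic are hypothetical; the try_* calls use BIGINT limits to force overflow):

```sql
-- try_* arithmetic returns NULL instead of failing on overflow
SELECT try_multiply(9223372036854775807, 2);   -- NULL (BIGINT overflow)
SELECT try_subtract(-9223372036854775807, 2);  -- NULL (BIGINT underflow)

-- A SQL UDF with a DEFAULT parameter value
CREATE FUNCTION discounted_price(price DOUBLE, discount DOUBLE DEFAULT 0.1)
  RETURNS DOUBLE
  RETURN price * (1 - discount);

SELECT discounted_price(100.0);        -- uses the default discount: 90.0
SELECT discounted_price(100.0, 0.25);  -- 75.0
```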

Preview

Version 2022.20: May 10-16, 2022

Web application user interface updates

The features listed in this section are independent of the SQL Endpoint compute versions described in the Channels section of the release notes.

May 12, 2022

  • Visualizations now support time binning directly in the UI. You can easily switch between yearly, monthly, daily, or hourly bins of your data by changing a dropdown value rather than adding and modifying a date_trunc() function in the query text itself (see the example after this list).

  • Dashboards now have color consistency by default. If you have the same series across multiple charts, the series is always colored the same across all charts – without requiring any manual configuration.
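
A sketch of the manual approach the dropdown replaces (the table and column names assume the nyctaxi sample data):

```sql
-- Previously, changing the bin meant editing date_trunc() by hand
SELECT date_trunc('month', tpep_pickup_datetime) AS pickup_month,
       COUNT(*) AS trip_count
FROM samples.nyctaxi.trips
GROUP BY 1
ORDER BY 1;
```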

May 3, 2022

  • When sharing a dashboard with a user or group, we now also provide the ability to share all upstream queries used by visualizations and parameters.

    • When you do not have permission to share one or more of the upstream queries, you will receive a warning message that not all queries could be shared.

    • The permissions granted when sharing a dashboard do not override, negate, or expand upon existing permissions on the upstream queries. For example, if a user or group has Can Run as Owner permissions on the shared dashboard but only has Run as Viewer permissions on an upstream query, the effective permissions on that upstream query will be Run as Viewer.

April 27, 2022

  • Your dashboard layout is now retained when exporting to PDF on demand and generating scheduled subscription emails.

April 25, 2022

  • Databricks SQL is now available in Public Preview on Google Cloud Platform.

March 17, 2022

  • Charts includes a new combination visualization option. This allows you to create charts that include both bars and lines.

March 10, 2022

  • Unity Catalog (Preview) allows you to manage governance and access to your data at the level of the account. You can manage metastores and data permissions centrally, and you can assign a metastore to multiple workspaces in your account. You can manage and interact with Unity Catalog data and objects using the Databricks SQL Data Explorer or the SQL editor, and you can use Unity Catalog data in dashboards and visualizations. See Unity Catalog (Preview).

    Note

    Unity Catalog requires SQL endpoints to use version 2022.11, which is in the Preview channel.

  • Delta Sharing (Preview) allows you to share read-only data with recipients outside your organization. Databricks SQL supports querying Delta Sharing data and using it in visualizations and dashboards.

    Delta Sharing is subject to additional Service Specific Terms that must be accepted by a workspace admin in order to enable the feature.

    See Share data using Delta Sharing (Preview).

  • Each time a dashboard is refreshed manually or on a schedule, all queries in the dashboard and upstream, including those used by parameters, are refreshed. When an individual visualization is refreshed, all upstream queries, including those used by parameters, are refreshed.

March 3, 2022

  • The cohort visualization has been updated so that cohorts are interpolated from min and max values rather than 0 and 100. It’s now much easier to distinguish cohorts within the actual range of the data. Previously, if all numbers were close together, they used the same color. Now, numbers that are close together are more likely to use different colors, because the color scale is divided across the min-to-max range to form each series.

  • It’s easier to see whether a dashboard subscription schedule is active or paused. When you click Subscribe, if the dashboard subscription schedule is paused, the message This schedule has been paused appears. When a dashboard subscription schedule is paused, you can subscribe or unsubscribe from the dashboard, but scheduled snapshots are not sent and the dashboard’s visualizations are not updated.

  • When you view Query History, you can now sort the list by duration. By default, queries are sorted by start time.

February 24, 2022

  • In Data Explorer, you can now view the permissions users or groups have on a table, view, schema, or catalog. Click the object, then click Permissions and use the new filter box.

February 17, 2022

  • Visualizations just became a little smarter! When a query returns one or two columns, a recommended visualization type is automatically selected.

  • You can now create a histogram visualization to show how frequently each value occurs in a dataset and to understand whether the values are clustered around a small number of ranges or are more spread out.

  • In both Query History and Query Profile, you can now expand the query string and the error message of a failed query to full width. This makes it easier to analyze query plans and troubleshoot failed queries.

  • In bar, line, area, pie, and heatmap visualizations, you can now perform aggregation directly in the visualization configuration UI, without the need to modify the query itself. When leveraging these new capabilities, the aggregation is performed over the entire data set, rather than being limited to the first 64,000 rows. When editing a visualization created prior to this release, you will see a message that says This visualization uses an old configuration. New visualizations support aggregating data directly within the editor. If you want to leverage the new capabilities, you must re-create the visualization. See Enable aggregation in a visualization.

February 10, 2022

  • You can now set a custom color palette for a dashboard. All visualizations that appear in that dashboard will use the specified palette. Setting a custom palette does not affect how a visualization appears in other dashboards or the SQL editor.

    You can specify hex values for a palette or import colors from another palette, whether provided by Databricks or created by a workspace admin.

    When a palette is applied to a dashboard, all visualizations displayed in that dashboard will use the selected color palette by default, even if you configure custom colors when you create the visualization. To override this behavior, see Customize colors for a visualization.

  • Workspace admins can now create a custom color palette using the SQL admin console. After the custom color palette is created, it can be used in new and existing dashboards. To use a custom color palette for a dashboard or to customize it, you can edit dashboard settings.

  • When you add a visualization that uses parameters to a dashboard from the vertical ellipsis menu in the SQL editor, the visualization now uses dashboard-level parameters by default. This matches the behavior when you add a widget using the Add Visualization button in a dashboard.

  • When you view the query history and filter the list by a combination of parameters, the number of matching queries is now displayed.

  • In visualizations, an issue was fixed where the Y-axis range could not be adjusted to specific values.

February 3, 2022

  • The tabbed SQL editor is now enabled by default for all users. For more information or to disable the tabbed editor, see Edit multiple queries.

January 27, 2022

  • Improvements have been made to how you can view, share, and import a query’s profile. See Query profile.

  • The Details visualization now allows you to rename columns just like the Table visualization.

  • You can now close a tab in the SQL editor by middle-clicking it.

  • The following Keyboard shortcuts have been added to the tabbed SQL editor:

    • Close all tabs: Cmd+Option+Shift+A (macOS) / Ctrl+Alt+Shift+A (Windows)

    • Close other tabs: Cmd+Option+Shift+W (macOS) / Ctrl+Alt+Shift+W (Windows)

    These keyboard shortcuts provide an alternative to right-clicking a tab to access the same actions. To view all keyboard shortcuts, click the keyboard icon in the tabbed SQL editor.

January 20, 2022

  • The default formatting for integer and float data types in tables has been updated to not include commas. This means that by default, values like 10002343 will no longer have commas. To format these types to display with commas, click Edit Visualization, expand the area for the column, and modify the format to include a comma.

  • To better align with browser rendering limits, visualizations now display a maximum of 10,000 data points. For example, a scatterplot will display a maximum of 10,000 dots. If the number of data points has been limited, a warning is displayed.

January 13, 2022

  • We fixed an issue where the Save button in the SQL editor was sometimes disabled. The Save button is now always enabled and includes an asterisk (*) when unsaved changes are detected.

December 15, 2021

  • Databricks SQL is Generally Available. This marks a major milestone in providing you with the first Lakehouse Platform that unifies data, AI, and BI workloads in one place. With GA, you can expect the highest level of stability, support, and enterprise-readiness from Databricks for mission-critical workloads. Read the GA announcement blog to learn more.

  • Alerts are now scheduled independently of queries. When you create a new alert and create a query, you are prompted to also create a schedule for the alert. If you had an existing alert, we’ve duplicated the schedule from the original query. This change also allows you to set alerts for both Run as Owner and Run as Viewer queries. Run as Owner queries run on the designated alert schedule with the query owner’s credential. Run as Viewer queries run on the designated alert schedule with the alert creator’s credential. See Alerts and Schedule a query.

  • You can now re-order parameters in both the SQL editor and in dashboards.

  • The documentation for creating heatmap visualizations has been expanded. See Heatmap visualization.

December 9, 2021

  • When you create a table visualization, you can now set the font color for a column to a static value or a range of values based on the column’s values. The literal value is compared to the threshold. For example, to colorize results whose values exceed 500000, create the threshold > 500000, rather than > 500,000. See Conditionally format column colors.

  • Icons in the tabbed SQL editor schema browser now allow you to distinguish between tables and views.

December 1, 2021

  • You can now apply SQL configuration parameters at the workspace level. Those parameters automatically apply to all existing and new SQL endpoints in the workspace. See SQL configuration parameters.

November 18, 2021

  • You can now open the SQL editor by using a sidebar shortcut. To open the SQL editor, click SQL Editor.

  • If you have permission to create Data Science & Engineering clusters, you can now create SQL endpoints by clicking Create in the sidebar and clicking SQL endpoint.

  • Administrators can now transfer ownership of a query, dashboard, or alert to a different user via the UI.

November 4, 2021

  • In a Map (Choropleth) visualization, the maximum number of gradient steps for colors in the legend has been increased from 11 to 20. The default is 5 gradient steps, inclusive of Min color and Max color.

  • The tabbed SQL editor now supports bulk tab management. If you right-click on a tab, you’ll see the option to Close others, Close left, Close right, and Close all. Note that if you right-click on the first or last tab, you won’t see the options to Close left or Close right.

October 28, 2021

  • When you view a table in Data Explorer, you have two options to simplify interacting with the table:

    • Click Create > Query to create a query that selects all columns and returns the first 1000 rows. See Create a basic query.

    • Click Create > Quick Dashboard to open a configuration page where you can select columns of interest and create a dashboard and supporting queries that provide some basic information using those columns and showcase dashboard-level parameters and other capabilities. See Create a quick dashboard.

October 19, 2021

  • New keyboard shortcuts are now available in the tabbed editor:

    • Open new tab:

      • Windows: Ctrl+Alt+T

      • Mac: Cmd+Option+T

    • Close current tab:

      • Windows: Ctrl+Alt+W

      • Mac: Cmd+Option+W

    • Open query dialog:

      • Windows: Ctrl+Alt+O

      • Mac: Cmd+Option+O

September 23, 2021

  • You can now create a new dashboard by cloning an existing dashboard, as long as you have the Can Run, Can Edit, and Can Manage permissions on the dashboard and all upstream queries. See Clone a dashboard.

  • You can now use GROUP BY in a visualization with multiple Y-axis columns. See Grouping.

  • You can now use {{@@yPercent}} to format data labels in an unnormalized stacked bar chart. See Stacking.

  • If you use SAML authentication and your SAML credential will expire within a few minutes, you are now proactively prompted to log in again before executing a query or refreshing a dashboard. This helps to prevent disruption due to a credential that expires during query execution.

September 16, 2021

  • In query results, BIGINT results are now serialized as strings when greater than 9007199254740991. This fixes a problem where BIGINT results could be truncated in query results. Other integer results are still serialized as numbers. Number formatting on axis labels and tooltips does not apply to BIGINT results that are serialized as strings. For more information about data types in Databricks SQL, see BIGINT type (Databricks SQL).
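
An illustration (9007199254740991 is 2^53 − 1, the largest integer that JavaScript number types represent exactly):

```sql
-- Larger BIGINT values are now serialized as strings in query results
SELECT 9007199254740993 AS big_value;
```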

September 7, 2021

Databricks is rolling out the changes that follow over the course of a week. Your workspace may not be enabled for these changes until after September 7.

  • Databricks SQL is now in Public Preview and enabled for all users in new workspaces.

    Note

    If your workspace was enabled for Databricks SQL during the Public Preview—that is, before the week beginning September 7, 2021—users retain the entitlement assigned before that date, unless you change it. In other words, if a user did not have access to Databricks SQL during the Public Preview, they will not have it now unless an administrator gives it to them.

  • Administrators can manage which users have access to Databricks SQL by assigning the Databricks SQL access entitlement (databricks-sql-access in the API) to users or groups. By default, new users have this entitlement.

    Administrators can limit a user or group to accessing only Databricks SQL and prevent them from accessing Data Science & Engineering or Databricks Machine Learning by removing the Workspace Access entitlement (workspace-access in the API) from the user or group. By default, new users have this entitlement.

    Important

    To log in and access Databricks, a user must have either the Databricks SQL access or Workspace access entitlement (or both).

    For more information, see Manage users and groups.

  • A small classic SQL endpoint called Starter Endpoint is pre-configured on all workspaces, so you can get started creating dashboards, visualizations, and queries right away. To handle more complex workloads, you can easily increase its size (to reduce latency) or the number of underlying clusters (to handle more concurrent users). To manage costs, the starter endpoint is configured to terminate after 120 minutes idle.

  • If Serverless compute (Private Preview) is enabled for your workspace and you enable Serverless SQL endpoints, a Serverless SQL endpoint called Serverless Starter Endpoint is automatically created, and you can use it for dashboards, visualizations, and queries. Serverless SQL endpoints start more quickly than classic SQL endpoints and automatically terminate after 10 minutes idle.

  • To help you get up and running quickly, a new guided onboarding experience is available for administrators and users. The onboarding panel is visible by default, and you can always see how many onboarding tasks remain in the sidebar, above the help icon. Click the remaining tasks count to reopen the onboarding panel.

  • You can get started using Databricks SQL quickly with two rich datasets in a read-only catalog called SAMPLES, which is available from all workspaces. When you learn about Databricks SQL, you can use these databases to create queries, visualizations, and dashboards. No configuration is required, and all users have access to these databases.

    • The nyctaxi database contains taxi trip data in the trips table.

    • The tpch database contains retail revenue and supply chain data in the following tables:

      • customer

      • lineitem

      • nation

      • orders

      • part

      • partsupp

      • region

      • supplier

    Click Run your first query in the onboarding panel to generate a new query of the nyctaxi database; a sample query against this data appears after this list.

  • To learn about visualizing data in Databricks SQL with no configuration required, you can import dashboards from the Sample Dashboard Gallery. These dashboards are powered by the datasets in the SAMPLES catalog.

    To view the Sample Dashboard Gallery, click Import sample dashboard in the onboarding panel.

  • You can now create and drop native SQL functions using the CREATE FUNCTION and DROP FUNCTION commands.
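
A minimal sketch of querying the sample data and of the new function commands (the function name and logic are hypothetical):

```sql
-- Explore the read-only sample data
SELECT * FROM samples.nyctaxi.trips LIMIT 10;

-- Create, use, and drop a native SQL function
CREATE FUNCTION trip_minutes(pickup TIMESTAMP, dropoff TIMESTAMP)
  RETURNS DOUBLE
  RETURN (unix_timestamp(dropoff) - unix_timestamp(pickup)) / 60.0;

SELECT trip_minutes(tpep_pickup_datetime, tpep_dropoff_datetime) AS minutes
FROM samples.nyctaxi.trips
LIMIT 10;

DROP FUNCTION trip_minutes;
```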

September 2, 2021

  • Users with the Can Edit permission on a dashboard can now manage the dashboard’s refresh schedule and subscription list. Previously, the Can Manage permission was required. For more information, see Automatically refresh a dashboard.

  • You can now temporarily pause scheduled export to dashboard subscribers without modifying the schedule. Previously, you had to remove all subscribers, disable the schedule, and then recreate it later. For more information, see Temporarily pause scheduled dashboard updates.

  • By default, visualizations no longer dynamically resize based on the number of results returned, but maintain the same height regardless of the number of results. To return to the previous behavior and configure a visualization to dynamically resize, enable Dynamically resize panel height in the visualization’s settings in the dashboard. For more information, see Tables.

  • If you have access to more than one workspace in the same account, you can switch workspaces from within Databricks SQL. Click the account icon in the lower left corner of your Databricks workspace, then select a workspace to switch to it.

August 30, 2021

  • Serverless SQL endpoints provide instant compute, minimal management, and cost optimization for SQL queries.

    Until now, computation for SQL endpoints happened in the data plane in your AWS account. The initial release of Serverless compute adds Serverless SQL endpoints to Databricks SQL, moving those compute resources to the Databricks cloud account in a shared service.

    You use Serverless SQL endpoints with Databricks SQL queries just like you use the SQL endpoints that live in your own AWS account, now called Classic SQL endpoints. But Serverless SQL endpoints typically start with low latency compared to Classic SQL endpoints, are easier to manage, and are optimized for cost.

    Before you can create Serverless SQL endpoints, an admin must enable the Serverless SQL endpoints option for your workspace. Once enabled, new SQL endpoints are Serverless by default, but you can continue to create SQL endpoints as Serverless or Classic as you like.

    For details about the Serverless compute architecture and comparisons with the Classic data plane, see Serverless compute. For details about configuring Serverless SQL endpoints—including how to convert Classic SQL endpoints to Serverless—see Enable Serverless SQL endpoints.

    For the list of supported regions for Serverless SQL endpoints, see Supported Databricks clouds and regions.

    Important

    Serverless Compute is subject to additional Service Specific Terms that must be accepted by a workspace admin in order to enable the feature.

August 12, 2021

  • You can now send a scheduled dashboard update to email addresses that are not associated with Databricks accounts. When viewing a dashboard, click Scheduled to view or update the list of subscribed email addresses. If an email address is not associated with a Databricks account, it must be configured as an alert destination. For more information, see Automatically refresh a dashboard.

August 05, 2021

  • Improved EXPLAIN result formatting

    • Explain results are easier to read

    • Formatted as monospaced with no line wrap
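
For instance (a sketch; the table name assumes the tpch sample data described in the September 7, 2021 notes):

```sql
-- EXPLAIN output now renders in a monospaced font with no line wrapping
EXPLAIN FORMATTED
SELECT o_orderpriority, COUNT(*) AS order_count
FROM samples.tpch.orders
GROUP BY o_orderpriority;
```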

July 08, 2021

  • Visualization widgets in dashboards now have titles and descriptions so that you can tailor the title and description of visualizations used in multiple dashboards to the dashboard itself.

  • The sidebar has been updated for improved visibility and navigation:

    • Endpoints is renamed to SQL Endpoints, and History is renamed to Query History.

    • Account settings (formerly named Users) have been moved to Account. When you select Account, you can change the Databricks workspace and log out.

    • User settings have been moved to Settings and have been split into User Settings and SQL Admin Console. The SQL Admin Console is visible only to admins.

    • The help icon is now labeled Help.

July 01, 2021

  • The new data explorer allows you to easily explore and manage permissions on databases and tables. Users can view schema details, preview sample data, and see table details and properties. Administrators can view and change data object owners, and data object owners can grant and revoke permissions. For details, see Data explorer.

  • Y-axes in horizontal charts have been updated to reflect the same ordering as in tables. If you have previously selected reverse ordering, you can use the Reverse Order toggle on the Y-axis tab to reverse the new ordering.

June 23, 2021

  • Temp views are now supported.
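
A brief sketch (the view name is hypothetical, and the table assumes the nyctaxi sample data; a temp view is scoped to the current session):

```sql
CREATE TEMPORARY VIEW long_trips AS
SELECT * FROM samples.nyctaxi.trips
WHERE trip_distance > 10;

SELECT COUNT(*) FROM long_trips;
```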

June 17, 2021

  • Photon, Databricks’ new vectorized execution engine, is now on by default for newly created SQL endpoints (both UI and REST API). Photon transparently speeds up:

    • Writes to Parquet and Delta tables.

    • Many SQL queries. See Limitations.

  • Easily manage users and groups with the CREATE GROUP, DROP GROUP, ALTER GROUP, SHOW GROUPS, and SHOW USERS commands (see the sketch after this list). For details, see Security statements and Show statements.

  • The query editor schema browser is now faster and more responsive on databases with more than 100 tables. On such databases, the schema browser does not load all columns automatically; the list of tables still shows as usual, but columns load only when you click a table. This change affects query autocomplete in the query editor, which depends on this information to show suggestions: until you expand a table and load its columns, those suggestions are not available.
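
A minimal sketch of the new statements (the group and user names are hypothetical):

```sql
CREATE GROUP analysts;
ALTER GROUP analysts ADD USER `alice@example.com`;
SHOW GROUPS;
SHOW USERS;
DROP GROUP analysts;
```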

June 03, 2021

  • Admins of newly enabled Databricks workspaces now receive the Databricks SQL entitlement by default and are no longer required to give themselves the Databricks SQL access entitlement using the admin console.

  • Photon is now in public preview and enabled by default for new SQL endpoints.

  • Multi-cluster load balancing is now in public preview.

  • You can now enable collaboration on dashboards and queries with other members of your organization using Can Edit permission. See Dashboard access control and Query access control.

May 26, 2021

  • SQL Analytics is renamed to Databricks SQL. This change has the following customer-facing impacts:

    • References in the web UI have been updated.

    • The entitlement to grant access to Databricks SQL has been renamed:

      • UI: Databricks SQL access (previously SQL Analytics access)

      • SCIM API: databricks-sql-access (previously sql-analytics-access)

      Users, groups, and service principals with the previous entitlement have been migrated to the new entitlement.

    • Tags for audit log events related to Databricks SQL have changed:

      • The prefix for Databricks SQL events is now databrickssql.

      • changeSqlAnalyticsAcl is now changeDatabricksSqlAcl.

  • Dashboard updates

    • The dashboard export filename has been updated to be the name of the dashboard + timestamp, rather than a UUID.

    • The export records limit has been raised from 22,000 to 64,000 rows.

    • Dashboard authors now have the ability to periodically export and email dashboard snapshots. Dashboard snapshots are taken from the default dashboard state, meaning that any interaction with the visualizations will not be present in the snapshot.

      • If you are the owner of a dashboard, you can create a refresh schedule and subscribe other users, who’ll get email snapshots of the dashboard every time it’s refreshed.

      • If you have view permission for a dashboard, you can subscribe to existing refresh schedules.

      See Dashboard snapshot subscriptions.

  • Predicate pushdown expressions (StartsWith, EndsWith, Contains, Not(EqualTo()), and DataType) are disabled for AWS Glue Catalog since they are not supported.

May 20, 2021

  • You can now use your own key from AWS KMS to encrypt the Databricks SQL queries and query history stored in Databricks. If you’ve already configured your own key for a workspace to encrypt data for managed services (notebooks and secrets), then no further action is required. The same customer-managed key for managed services now also encrypts the Databricks SQL queries and query history. See Customer-managed keys for managed services. This change affects only new data that is stored at rest. Databricks SQL queries and query history that were stored before today are not guaranteed to be encrypted with this key.

    Databricks SQL query results are stored in your root S3 bucket that you provided during workspace setup, and they are not encrypted by your managed services key. However, you can use your own key to encrypt them. See Customer-managed keys for workspace storage.

    This feature is available with the Enterprise pricing plan.

  • The Past executions tab now shows relative time.

May 13, 2021

  • Databricks SQL no longer tries to guess column types. Previously, a column with the format xxxx-yy-dd was automatically treated as a date, even if it was an identification code. Now that column is no longer automatically treated as a date; you must specify the conversion in the query if you want it. This change may cause some visualizations that relied on the previous behavior to stop working. In this release, you can change the Backwards Compatibility option in Settings to return to the previous behavior. In a future release, that option will be removed.

  • The query editor now has a query progress indicator. State changes are now visible in a continually updated progress bar.

May 06, 2021

  • You can now download the contents of the dashboard as a PDF. See Download as PDF.

  • An admin user now has view access to all queries and dashboards. In this view, an admin can view and delete any query or dashboard. However, the admin can’t edit a query or dashboard that is not shared with them. See Query admin view and Dashboard admin view.

  • The ability to increase endpoint concurrency with multi-cluster load balancing is now available for all accounts. You can create endpoints that autoscale between specified minimum and maximum cluster counts. Overloaded endpoints will scale up and underloaded endpoints will scale down.

April 29, 2021

  • Query options and details are now organized in a set of tabs to the left of the query editor.

April 22, 2021

  • Fixed an issue in which endpoints were inaccessible and appeared to be deleted due to internal error.

April 16, 2021

Databricks SQL maintains compatibility with Apache Spark SQL semantics. This release updates the semantics to match those of Apache Spark 3.1. Previously, Databricks SQL was aligned with Apache Spark 3.0 semantics. A short example illustrating two of these changes follows the list.

  • Statistical aggregation functions, including std, stddev, stddev_samp, variance, var_samp, skewness, kurtosis, covar_samp, and corr, return NULL instead of Double.NaN when a divide-by-zero occurs during expression evaluation, for example, when stddev_samp is applied to a single-element set. Prior to this release, these functions returned Double.NaN.

  • grouping_id() returns long values. Prior to this release, this function returned int values.

  • Query plan EXPLAIN results are now formatted.

  • from_unixtime, unix_timestamp, to_unix_timestamp, to_timestamp, and to_date fail if the specified datetime pattern is invalid. Prior to this release, they returned NULL.

  • The Parquet, ORC, Avro, and JSON data sources throw the exception org.apache.spark.sql.AnalysisException: Found duplicate column(s) in the data schema during reads if they detect duplicate names in top-level columns or in nested structures.

  • Structs and maps are wrapped in {} brackets when cast to strings. For instance, the show() action and the CAST expression use these brackets. Prior to this release, [] brackets were used for the same purpose.

  • NULL elements of structs, arrays, and maps are converted to “null” when cast to strings. Prior to this release, NULL elements were converted to empty strings.

  • When the sum of a decimal-type column overflows, it returns NULL. Prior to this release, the sum could return null, return an incorrect result, or even fail at runtime, depending on the actual query plan execution.

  • An IllegalArgumentException is returned for incomplete interval literals, for example, INTERVAL '1' or INTERVAL '1 DAY 2', which are invalid. Prior to this release, these literals resulted in NULLs.

  • Loading and saving timestamps from and to Parquet files fails if the timestamps are before 1900-01-01 00:00:00Z and are loaded (or saved) as the INT96 type. Prior to this release, these operations did not fail but could shift the input timestamps due to rebasing between the Julian and Proleptic Gregorian calendars.

  • The schema_of_json and schema_of_csv functions return the schema in SQL format, with field names quoted. Prior to this release, these functions returned a catalog string without field quoting and in lowercase.

  • The CHAR, CHARACTER, and VARCHAR types are supported in table schemas. Table scans and insertions respect the char/varchar semantics. If char/varchar is used in places other than a table schema, an exception is thrown (CAST is an exception: it simply treats char/varchar as string, as before).

  • The following exceptions are thrown for tables from Hive external catalog:

    • ALTER TABLE .. ADD PARTITION throws PartitionsAlreadyExistException if the new partition already exists.

    • ALTER TABLE .. DROP PARTITION throws NoSuchPartitionsException for partitions that do not exist.
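
A quick sketch illustrating two of these changes:

```sql
-- Structs cast to STRING now use {} brackets, and NULL elements render as "null"
SELECT CAST(named_struct('a', 1, 'b', CAST(NULL AS INT)) AS STRING);
-- Result: {1, null}

-- Divide-by-zero in statistical aggregates now yields NULL instead of NaN
SELECT stddev_samp(col) FROM VALUES (1.0) AS t(col);
-- Result: NULL
```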

April 13, 2021

  • Improved query throughput with SQL endpoint queuing. Queries submitted to a SQL endpoint now queue when the endpoint is already saturated with running queries. This improves query throughput by not overloading the endpoint with requests. You can view the improved performance in the endpoint monitoring screen.

April 01, 2021

  • Quickly find the time spent in compilation, execution, and result fetching for a query in Query History. See Query profile. Previously this information was only available by clicking a query and opening the Execution Details tab.

  • SQL endpoints no longer scale beyond the maximum specified clusters. All clusters allocated to a SQL endpoint are recycled after 24 hours, which can create a brief window in which there is one additional cluster.

March 18, 2021

  • Autocomplete in the query editor now supports Databricks SQL syntax and is context and alias aware. See Construct a query.

  • JDBC and ODBC requests no longer fail with invalid session errors after the session times out on the server. BI clients are now able to seamlessly recover when session timeouts occur.

March 11, 2021

  • Administrators and users in workspaces newly enabled for Databricks SQL no longer automatically have access to Databricks SQL. To enable access to Databricks SQL, the administrator must:

    1. Go to the admin console.

    2. Click the Users tab.

    3. In the row for their account, click the Databricks SQL access checkbox.

    4. Click Confirm.

    5. Repeat steps 3 and 4 to grant users access to Databricks SQL or follow the instructions in Grant a group access to Databricks SQL to grant access to groups.

  • Easily create queries, dashboards, and alerts by selecting Create > [Query | Dashboard | Alert] at the top of the sidebar.

  • Query Editor now saves drafts, and you can revert to a saved query. See Revert to a saved query.

  • You can no longer create external data sources.

  • The reliability of the SQL endpoint monitoring chart has been improved. The chart no longer intermittently shows spurious error messages.

March 04, 2021

  • The Queries and Dashboards API documentation is now available. See Queries and Dashboards API 2.0.

  • Scheduled dashboard refreshes are now always performed. The refreshes are performed in the web application, so you no longer need to keep the dashboard open in a browser. See Automatically refresh a dashboard.

  • New SQL endpoints created using the SQL Endpoints API now have Auto Stop enabled with a default timeout of two hours.

  • Tableau Online users can now connect to SQL endpoints. See the new Tableau Online quickstart.

  • SQL endpoints no longer fail to launch due to inadequate AWS resources in a single availability zone.

February 26, 2021

The new Power BI connector for Azure Databricks, released in public preview in September 2020, is now GA. It provides:

  • Simple connection configuration: the new Power BI Databricks connector is integrated into Power BI, and you configure it using a simple dialog with a couple of clicks.

  • Faster imports and optimized metadata calls, thanks to the new Databricks ODBC driver, which comes with significant performance improvements.

  • Access to Databricks data through Power BI respects Databricks table access control.

For more information, see Power BI.

February 25, 2021

  • Setting permissions on a SQL endpoint is now faster: it’s a step right after you create a new SQL endpoint, and it’s easily accessible when you edit an existing endpoint. See Create a SQL endpoint and Convert a Classic SQL endpoint to a Serverless SQL endpoint.

  • To reuse visualization settings you can now duplicate a visualization. See Clone a visualization.

  • Query results are now stored in your account instead of the Databricks account.

  • To prevent leaking information by listing all defined permissions on an object, to run SHOW GRANTS [<user>] <object> you must be either:

    • A Databricks SQL administrator or the owner of <object>.

    • The user specified in [<user>].
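
For example (a sketch following the form above; the user and table names are hypothetical, and the exact object clause may vary by version):

```sql
-- Lists the permissions granted to one user on one table
SHOW GRANTS `alice@example.com` ON TABLE default.trips;
```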

January 07, 2021

  • To reduce spending on idle endpoints, new SQL endpoints now have Auto Stop enabled with a default timeout of two hours. After the timeout is reached, the endpoint is stopped. You can edit the timeout period or disable Auto Stop at any time.

  • Except for TEXT type query parameters, quotation marks are no longer added to query parameters. If you have used Dropdown List, Query Based Dropdown List, or any Date type query parameters, you must add quotation marks in order for the query to work. For example, if your query is SELECT {{ d }}, now this query must be SELECT '{{ d }}'.

November 18, 2020

Databricks is pleased to introduce the Public Preview of Databricks SQL, an intuitive environment for running ad-hoc queries and creating dashboards on data stored in your data lake. Databricks SQL empowers your organization to operate a multi-cloud lakehouse architecture that provides data warehousing performance with data lake economics. Databricks SQL:

  • Integrates with the BI tools you use today, like Tableau and Microsoft Power BI, to query the most complete and recent data in your data lake.

  • Complements existing BI tools with a SQL-native interface that allows data analysts and data scientists to query data lake data directly within Databricks.

  • Enables you to share query insights through rich visualizations and drag-and-drop dashboards with automatic alerting for important data changes.

  • Uses SQL endpoints to bring reliability, quality, scale, security, and performance to your data lake, so you can run traditional analytics workloads using your most recent and complete data.

  • Introduces the USAGE privilege to simplify data access administration. In order to use an object in a database, you must be granted the USAGE privilege on that database in addition to any privileges you need to perform the action. The USAGE privilege can be granted to databases or to the catalog. For workspaces that already use table access control, the USAGE privilege is granted automatically to the users group on the root CATALOG. See Data access control for details.
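
A sketch of the resulting grant pattern (the database, table, and group names are hypothetical):

```sql
-- USAGE on the database is required in addition to the object-level privilege
GRANT USAGE ON DATABASE sales TO `analysts`;
GRANT SELECT ON TABLE sales.orders TO `analysts`;
```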

See the Databricks SQL guide for details. Contact your Databricks representative to request access.

Fixed issues

  • SQL Editor. The SQL editor will now persist selected text and scroll position when switching between query tabs.

  • SQL Editor. If you click ‘Run’ on a query in the SQL Editor, then navigate to another page and return while the query is still executing, the editor will display the correct query state. If the query completes while you are on another page, query results will be available on return to the SQL Editor page.

  • You can now use MySQL 8.0 as an external metastore.

  • DESCRIBE DETAIL commands on Delta tables no longer fail with java.lang.ClassCastException: java.sql.Timestamp cannot be cast to java.time.Instant.

  • Reading Parquet files with INT96 timestamps no longer fails.

  • When a user with the Can Run permission runs a query created by another user, the query history now displays the user who ran the query.

  • Null values are now ignored when rendering a chart, improving the usability of charts. For example, previously, bars in a bar chart would look very small when null values were present. Now the axes are set based on non-null values only.

Known issues

  • Reads from data sources other than Delta Lake in multi-cluster load balanced SQL endpoints can be inconsistent.

  • Delta tables accessed in Databricks SQL upload their schema and table properties to the configured metastore. If you are using an external metastore, you will be able to see Delta Lake information in the metastore. Delta Lake tries to keep this information as up-to-date as possible on a best-effort basis. You can also use the DESCRIBE <table> command to ensure that the information is updated in your metastore.

  • Databricks SQL does not support zone offsets like ‘GMT+8’ as session time zones. The workaround is to use a region-based time zone (https://en.wikipedia.org/wiki/List_of_tz_database_time_zones) like ‘Etc/GMT+8’ instead. See SET TIME ZONE for more information about setting time zones.
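
For example (a sketch of the workaround):

```sql
-- Rejected: SET TIME ZONE 'GMT+8';
-- Works: a region-based tz database identifier
SET TIME ZONE 'Etc/GMT+8';
```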

Frequently asked questions (FAQ)

How are Databricks SQL workloads charged?

Databricks SQL workloads are charged according to the SQL Compute SKU.

Where do SQL endpoints run?

Like Databricks clusters, Classic SQL endpoints are created and managed in your AWS account. Classic SQL endpoints manage SQL-optimized clusters automatically in your account and scale to match end-user demand.

Serverless SQL endpoints (Public Preview), on the other hand, use compute resources in the Databricks cloud account. Serverless SQL endpoints simplify SQL endpoint configuration and usage and accelerate launch times. The Serverless option is available only if it has been enabled for the workspace. For more information, see Enable Serverless SQL endpoints and Serverless compute.

I have been granted access to data using a cloud provider credential. Why can’t I access this data in Databricks SQL?

In Databricks SQL, all access to data is subject to data access control, and an administrator or data owner must first grant you the appropriate privileges.