August 25, 2022
For supported regions, see the limitations section of these release notes.
For more information about Unity Catalog, see Overview of Unity Catalog.
A metastore can have up to 1000 catalogs.
A catalog can have up to 10,000 schemas.
A schema can have up to 10,000 tables.
For full Unity Catalog quotas, see Resource quotas for Unity Catalog.
All managed Unity Catalog tables store data with Delta Lake.
External Unity Catalog tables and external locations support Delta Lake, JSON, CSV, Avro, Parquet, ORC, and text data.
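For example, an external table can be defined over one of these formats with a `CREATE TABLE ... USING` statement. In this sketch, the catalog, schema, and table names (`main.default.sales_raw`) and the storage path are placeholders, not names from these release notes:

```sql
-- Hypothetical example: an external table over CSV files.
-- The three-level name and the LOCATION path are placeholders.
CREATE TABLE main.default.sales_raw (
  id INT,
  amount DOUBLE
)
USING CSV
OPTIONS (header = 'true')
LOCATION 's3://my-bucket/sales/';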
Use the Databricks accounts console UI to:
Manage the metastore lifecycle (create, update, delete, and view Unity Catalog-managed metastores)
Assign and remove metastores for workspaces
Unity Catalog has two supported access modes when you create a new cluster:

Shared: A secure cluster that can be shared by multiple users. Cluster users are fully isolated so that they cannot see each other’s data and credentials. Languages: SQL, Python.

Single user: A secure cluster that can be used exclusively by a specified single user. Languages: SQL, Scala, Python, R.
All SQL warehouses are created in shared access mode. To enable Unity Catalog for Databricks SQL, select the Preview channel when configuring a SQL warehouse. See Databricks SQL release notes.
For more information about cluster access modes, see Create clusters and SQL warehouses that can access Unity Catalog.
information_schema is fully supported for Unity Catalog data assets. Each metastore includes a catalog referred to as system that includes a metastore-scoped information_schema. See Information schema. You can use information_schema to answer questions like the following:
“Count the number of tables per catalog”
SELECT table_catalog, count(table_name)
FROM system.information_schema.tables
GROUP BY 1
ORDER BY 2 DESC;
“Show me all of the tables that have been altered in the last 24 hours”
SELECT table_name, table_owner, created_by, last_altered, last_altered_by, table_catalog
FROM system.information_schema.tables
WHERE datediff(now(), last_altered) < 1;
Structured Streaming workloads are now supported with Unity Catalog. For details and limitations, see the limitations section of these release notes. See Using Unity Catalog with Structured Streaming.
User-defined SQL functions are now fully supported on Unity Catalog. For information about how to create and use SQL UDFs, see CREATE FUNCTION (SQL).
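As a sketch of what a SQL UDF looks like, the following defines and calls a simple scalar function. The catalog and schema names (`main.default`) and the function name are placeholders for illustration only:

```sql
-- Hypothetical SQL UDF; the three-level name is a placeholder.
CREATE FUNCTION main.default.to_fahrenheit(celsius DOUBLE)
  RETURNS DOUBLE
  COMMENT 'Converts Celsius to Fahrenheit'
  RETURN celsius * 9 / 5 + 32;

SELECT main.default.to_fahrenheit(20);
```

Because the function is registered in a Unity Catalog schema, it is governed by the same three-level namespace and privilege model as tables.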
Standard data definition language (DDL) commands are now supported in Spark SQL for external locations, including the following:
CREATE | DROP | ALTER | DESCRIBE | SHOW EXTERNAL LOCATION
You can also manage and view permissions for external locations with SQL, using GRANT, REVOKE, and SHOW. See External locations (Databricks SQL).
CREATE EXTERNAL LOCATION <your_location_name>
URL '<your_location_path>'
WITH (CREDENTIAL <your_credential_name>);

GRANT READ FILES ON EXTERNAL LOCATION <your_location_name> TO <group>;
Scala, R, and workloads using the Machine Learning Runtime are supported only on clusters using the single user access mode. Workloads in these languages do not support the use of dynamic views for row-level or column-level security.
Shallow clones are not supported when using Unity Catalog as the source or target of the clone.
Bucketing is not supported for Unity Catalog tables. Commands that try to create a bucketed table in Unity Catalog throw an exception.
Writing to the same path or Delta Lake table from workspaces in multiple regions can lead to unreliable performance if some clusters access Unity Catalog and others do not.
Overwrite mode for DataFrame write operations into Unity Catalog is supported only for Delta tables, not for other file formats. The user must have the CREATE privilege on the parent schema and must be the owner of the existing object.
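For example, the schema-level CREATE privilege required for overwrites can be granted with SQL. The schema name and principal below are placeholders, not values from these release notes:

```sql
-- Hypothetical grant; schema and user names are placeholders.
GRANT CREATE ON SCHEMA main.default TO `user@example.com`;
```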
Streaming currently has the following limitations:
It is not supported in clusters using shared access mode. For streaming workloads, you must use single user access mode.
Asynchronous checkpointing is not yet supported.
Streaming queries that run for more than 30 days on all-purpose or jobs clusters throw an exception. For long-running streaming queries, configure automatic job retries.
Referencing Unity Catalog tables from Delta Live Tables pipelines is currently not supported.
Groups previously created in a workspace cannot be used in Unity Catalog GRANT statements. This is to ensure a consistent view of groups that can span across workspaces. To use groups in GRANT statements, create your groups in the account console and update any automation for principal or group management (such as SCIM, Okta and AAD connectors, and Terraform) to reference account endpoints instead of workspace endpoints.
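Once a group exists at the account level, it can be referenced in a GRANT statement like any other principal. In this sketch, the table name and the group `data-engineers` are assumed examples, with the group created in the account console rather than in a workspace:

```sql
-- `data-engineers` is assumed to be an account-level group;
-- the table name is a placeholder.
GRANT SELECT ON TABLE main.default.my_table TO `data-engineers`;
```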
Unity Catalog requires the E2 version of the Databricks platform. All new Databricks accounts and most existing accounts are on E2. If you are unsure which account type you have, contact your Databricks representative.
At the time Unity Catalog was declared GA, it was available in the following regions.
To use Unity Catalog in another region, contact your account representative.