Unity Catalog managed tables in Databricks for Delta Lake and Apache Iceberg

Preview

Unity Catalog managed tables are generally available for Delta Lake tables. For Apache Iceberg tables, this feature is in Public Preview and available in Databricks Runtime 16.4 LTS and above.

This page describes Unity Catalog managed tables in Delta Lake and Apache Iceberg, the default and recommended table type in Databricks. These tables are fully governed and optimized by Unity Catalog, offering better performance and lower storage and compute costs than external and foreign tables, because Unity Catalog learns from your read and write patterns. Unity Catalog manages all read, write, storage, and optimization responsibilities for managed tables. See Convert an external table to a managed Unity Catalog table.

Data files for managed tables are stored in the managed storage location of the schema or catalog that contains them. See Specify a managed storage location in Unity Catalog.

Databricks recommends using managed tables to take advantage of:

  • Reduced storage and compute costs.
  • Faster query performance across all client types.
  • Automatic table maintenance and optimization.
  • Secure access for non-Databricks clients via open APIs.
  • Support for Delta Lake and Iceberg formats.
  • Automatic upgrades to the latest platform features.

Managed tables support interoperability by allowing access from Delta Lake and Iceberg clients. Through open APIs and credential vending, Unity Catalog enables external engines such as Trino, DuckDB, Apache Spark, Daft, and Iceberg REST catalog-integrated engines like Dremio to access managed tables. Delta Sharing, an open source protocol, enables secure, governed data sharing with external partners and platforms.

You can work with managed tables across all languages and products supported in Databricks. You need certain privileges to create, update, delete, or query managed tables. See Manage privileges in Unity Catalog.

All reads and writes to managed tables must reference the table by name, fully qualified with its catalog and schema (for example, catalog_name.schema_name.table_name).
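As a sketch of what fully qualified references look like, assuming a hypothetical catalog main with schema sales:

```sql
-- Placeholder names: catalog `main`, schema `sales`, table `orders`
SELECT * FROM main.sales.orders;

INSERT INTO main.sales.orders VALUES (1001, 'widget', 3);
```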

note

This page focuses on Unity Catalog managed tables. For managed tables in the legacy Hive metastore, see Database objects in the legacy Hive metastore.

Why use Unity Catalog managed tables?

Unity Catalog managed tables automatically optimize storage costs and query speeds using AI-driven technologies like automatic clustering, file size compaction, and intelligent statistics collection. These tables simplify data management with features like automatic vacuuming and metadata caching, while ensuring interoperability with Delta and Iceberg third-party tools.

The following features are unique to Unity Catalog managed tables and are not available for external or foreign tables.

| Feature | Benefits | Enabled by default? | Configurable? |
| --- | --- | --- | --- |
| Predictive optimization | Automatically optimizes your data layout and compute using AI, so you don't need to manually run maintenance operations for managed tables. Databricks recommends enabling predictive optimization for all managed tables to reduce storage and compute costs. | Yes, for all new accounts created on or after November 11, 2024. For existing accounts, Databricks is rolling out predictive optimization by default. See Check whether predictive optimization is enabled. | Yes. See Enable predictive optimization. |
| Automatic liquid clustering | For tables with predictive optimization, enabling automatic liquid clustering allows Databricks to intelligently select clustering keys. As query patterns change, Databricks automatically updates clustering keys to improve performance and lower costs. | No | Yes. See Enable liquid clustering. |
| Metadata caching | In-memory caching of transaction metadata enhances query performance by minimizing requests to the transaction log stored in cloud storage. | Yes | No. Metadata caching is always enabled for managed tables. |
| Automatic file deletion after a DROP TABLE command | If you drop a managed table, Databricks deletes the data in cloud storage after 7 days, reducing storage costs. For external tables, you must manually delete the files from your storage bucket. | Yes | No. For managed tables, files are always deleted automatically after 7 days. |
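As a sketch of how the configurable features above are turned on, using placeholder catalog, schema, and table names:

```sql
-- Enable predictive optimization for every managed table in a schema
-- (placeholder names; requires predictive optimization at the account level)
ALTER SCHEMA main.sales ENABLE PREDICTIVE OPTIMIZATION;

-- Let Databricks select and evolve clustering keys automatically
ALTER TABLE main.sales.orders CLUSTER BY AUTO;
```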

Create a managed table

To create a managed table, you must have:

  • USE SCHEMA on the table's parent schema.
  • USE CATALOG on the table's parent catalog.
  • CREATE TABLE on the table's parent schema.
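If a user is missing any of these privileges, a privileged user can grant them. A minimal sketch, assuming a placeholder catalog main, schema sales, and group data_engineers:

```sql
-- Placeholder names: grant the privileges needed to create managed tables
GRANT USE CATALOG ON CATALOG main TO `data_engineers`;
GRANT USE SCHEMA ON SCHEMA main.sales TO `data_engineers`;
GRANT CREATE TABLE ON SCHEMA main.sales TO `data_engineers`;
```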

Use the following SQL syntax to create an empty managed table. Replace the placeholder values:

  • <catalog-name>: The name of the catalog that will contain the table.
  • <schema-name>: The name of the schema that will contain the table.
  • <table-name>: A name for the table.
  • <column-specification>: The name and data type for each column.
SQL
-- Create a managed Delta table
CREATE TABLE <catalog-name>.<schema-name>.<table-name>
(
  <column-specification>
);

-- Create a managed Iceberg table
CREATE TABLE <catalog-name>.<schema-name>.<table-name>
(
  <column-specification>
)
USING iceberg;

To maintain performance on reads and writes, Databricks periodically runs operations to optimize managed Iceberg table metadata. This task is performed using serverless compute, which has MODIFY permissions on the Iceberg table. This operation only writes to the table's metadata, and the compute only maintains permissions to the table for the duration of the job.

note

To create an Iceberg table, explicitly specify USING iceberg. Otherwise, Databricks creates a Delta Lake table by default.

You can also create managed tables from query results or DataFrame write operations.
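For instance, a CREATE TABLE AS SELECT (CTAS) statement creates a managed table populated with query results; the names below are placeholders:

```sql
-- Placeholder names: create a managed table from the results of a query
CREATE TABLE main.sales.orders_2024 AS
SELECT order_id, customer_id, total
FROM main.sales.orders
WHERE order_date >= '2024-01-01';
```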

Drop a managed table

To drop a managed table, you must have:

  • MANAGE on the table or you must be the table owner.
  • USE SCHEMA on the table's parent schema.
  • USE CATALOG on the table's parent catalog.

To drop a managed table, run the following SQL command:

SQL
DROP TABLE IF EXISTS catalog_name.schema_name.table_name;

Unity Catalog supports the UNDROP TABLE command to recover dropped managed tables for 7 days. After 7 days, Databricks marks the underlying data for deletion from your cloud tenant and removes files during automated table maintenance. See UNDROP.
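A sketch of recovering a dropped managed table within the 7-day window, using placeholder names:

```sql
-- Recover a managed table dropped within the last 7 days
UNDROP TABLE main.sales.orders;
```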