Databricks Unity Catalog table types
Unity Catalog supports three primary table types: managed, external, and foreign tables. Each type differs in how data is stored, managed, and governed.
Managed tables
Managed tables are the default and recommended table type. Unity Catalog manages the data lifecycle, storage location, and optimizations. When you drop a managed table, both the metadata and underlying data files are deleted.
Managed tables are backed by Delta Lake or Apache Iceberg and provide:
- Automatic optimization for reduced storage and compute costs
- Faster query performance across all client types
- Automatic table maintenance
- Secure access for non-Databricks clients via open APIs
- Automatic upgrades to the latest platform features
Data files are stored in the managed storage location associated with the schema or catalog containing the table. See Unity Catalog managed tables in Databricks for Delta Lake and Apache Iceberg.
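A minimal sketch of the managed-table lifecycle, assuming a Databricks notebook where `spark` is preconfigured and a hypothetical `main.sales` schema on which you hold CREATE TABLE privileges:

```python
# Create a managed Delta table; omitting a LOCATION clause lets
# Unity Catalog choose and manage the storage path.
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.sales.orders (
        order_id BIGINT,
        order_date DATE,
        amount DECIMAL(10, 2)
    )
""")

spark.sql("INSERT INTO main.sales.orders VALUES (1, DATE'2024-01-15', 99.95)")

# Dropping a managed table deletes the metadata and the underlying data files.
spark.sql("DROP TABLE main.sales.orders")
```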
External tables
External tables register data stored in cloud object storage that you manage. Unity Catalog governs data access but doesn't manage data lifecycle, optimizations, or storage layout. When you drop an external table, only the metadata is removed; the underlying data files remain.
Unity Catalog external tables support Delta Lake format (recommended) and CSV, JSON, AVRO, PARQUET, ORC, and TEXT formats. Non-Delta external tables lack the transactional guarantees and performance optimizations of Delta Lake.
Use external tables when you need to:
- Register existing data that isn't compatible with Unity Catalog managed tables
- Provide direct data access from non-Databricks clients that don't support other external access patterns
See Work with external tables.
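As a sketch, registering existing Delta files as an external table might look like the following; the catalog, schema, table, and cloud path are all placeholders, and the path must fall under an external location you're granted access to:

```python
# Register an external table by supplying an explicit LOCATION.
# Unity Catalog governs access, but the files stay under your control.
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.raw.clickstream
    USING DELTA
    LOCATION 'abfss://data@myaccount.dfs.core.windows.net/clickstream'
""")

# Dropping an external table removes only the metadata; the files at
# the LOCATION above remain.
spark.sql("DROP TABLE main.raw.clickstream")
```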
Foreign tables
Foreign tables (also called federated tables) are registered in Unity Catalog as part of a foreign catalog. External systems manage the data and metadata, while Unity Catalog adds data governance for querying.
Databricks supports two methods for registering foreign tables:
- Query federation: Uses secure JDBC connections to external data systems like PostgreSQL and MySQL
- Catalog federation: Connects external catalogs to query data directly in file storage
Foreign tables backed by Delta Lake lack many optimizations available in Unity Catalog managed tables. For production workloads or frequently queried datasets, migrate to Unity Catalog managed tables for better performance. See Work with foreign tables.
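For query federation, a hedged sketch of connecting to PostgreSQL follows; the connection name, host, database, and secret scope are placeholders, and storing the password in a secret rather than inline is the usual practice:

```python
# Create a Unity Catalog connection that holds the JDBC details
# for an external PostgreSQL instance.
spark.sql("""
    CREATE CONNECTION IF NOT EXISTS pg_conn TYPE postgresql
    OPTIONS (
        host 'pg.example.com',
        port '5432',
        user 'readonly_user',
        password secret('federation', 'pg-password')
    )
""")

# Mirror a database from that instance as a foreign catalog; its tables
# surface as foreign tables governed by Unity Catalog.
spark.sql("""
    CREATE FOREIGN CATALOG IF NOT EXISTS pg_sales
    USING CONNECTION pg_conn
    OPTIONS (database 'sales')
""")

# Foreign tables are queryable with ordinary three-level names.
spark.sql("SELECT * FROM pg_sales.public.customers LIMIT 10").show()
```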
Comparison of table types
The following table compares the three table types:
| Feature | Managed tables | External tables | Foreign tables |
|---|---|---|---|
| Data lifecycle management | Unity Catalog manages | You manage | External system manages |
| Storage location | Unity Catalog manages | You specify | External system manages |
| Automatic optimizations | Yes | Limited | No |
| Formats supported | Delta Lake, Apache Iceberg | Delta Lake (recommended), CSV, JSON, AVRO, PARQUET, ORC, TEXT | Depends on external system |
| Data deleted on drop | Yes | No | No |
| Best for | Production workloads, frequently queried data | Legacy integrations, existing data | Migration from external systems, temporary access |
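To confirm which type a given table is, one option is `DESCRIBE TABLE EXTENDED`, whose detail rows include `Type` (for example `MANAGED` or `EXTERNAL`) and `Location`; the table name below is hypothetical:

```python
# Inspect the table's metadata rows to see its type and storage location.
info = spark.sql("DESCRIBE TABLE EXTENDED main.sales.orders")
info.filter("col_name IN ('Type', 'Location')").show(truncate=False)
```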
Other table types
Databricks also supports specialized table types for specific use cases:
- Streaming tables: Lakeflow Spark Declarative Pipelines datasets backed by Delta Lake with incremental processing logic
- Materialized views: Lakeflow Spark Declarative Pipelines datasets backed by Delta Lake that materialize query results using managed flow logic
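As a sketch of both dataset types in a Lakeflow Spark Declarative Pipelines source file, assuming the pipeline-provided `dlt` module and a hypothetical `main.raw.orders` source table:

```python
import dlt
from pyspark.sql import functions as F

# Streaming table: the function returns a streaming DataFrame, so new
# source rows are processed incrementally.
@dlt.table(name="orders_bronze", comment="Raw orders, ingested incrementally")
def orders_bronze():
    return spark.readStream.table("main.raw.orders")

# Materialized view: the function returns a batch DataFrame whose results
# are materialized and kept fresh by managed flow logic.
@dlt.table(name="daily_revenue", comment="Revenue aggregated per day")
def daily_revenue():
    return (
        spark.read.table("main.raw.orders")
             .groupBy("order_date")
             .agg(F.sum("amount").alias("revenue"))
    )
```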
Legacy table types
The following legacy table types are supported for backward compatibility but aren't recommended for new development.
Hive tables
Hive tables use legacy patterns including the legacy Hive metastore, Hive SerDe codecs, or Hive SQL syntax. By default, tables registered using the legacy Hive metastore store data in the legacy DBFS root.
Migrate all tables from the legacy Hive metastore (HMS) to Unity Catalog. See Database objects in the legacy Hive metastore. You can optionally federate a Hive metastore to Unity Catalog. See Hive metastore federation: enable Unity Catalog to govern tables registered in a Hive metastore.
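For Delta tables, one common migration path is a deep clone from the Hive metastore into a Unity Catalog managed table; this is a sketch with hypothetical names, and non-Delta tables need another route such as CREATE TABLE AS SELECT:

```python
# Copy the data and metadata of a legacy Hive metastore table into a
# Unity Catalog managed table.
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.sales.orders
    DEEP CLONE hive_metastore.sales.orders
""")
```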
Apache Spark supports registering and querying Hive tables, but Hive SerDe codecs aren't optimized for Databricks. Register Hive tables only to support queries against data written by external systems. See Hive table (legacy).
Live tables
The term live tables refers to an earlier implementation of functionality now available as materialized views. Update legacy code that references live tables to use materialized view syntax. See Lakeflow Spark Declarative Pipelines and Materialized views.