Delta tables in Databricks

Tables backed by Delta Lake are known as Delta tables. A Delta table stores data as a directory of files in cloud object storage and registers its metadata to the metastore within a catalog and schema. Delta Lake is the default table format in Databricks, so most references to “tables” refer to Delta tables unless explicitly stated otherwise. See What is Delta Lake in Databricks?.

Databricks recommends using fully qualified table names instead of file paths when interacting with Delta tables. Although you can create tables that don’t use Delta Lake, those tables lack the transactional guarantees and performance optimizations of Delta tables.
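For example, a minimal sketch of creating and querying a Delta table by its fully qualified three-level name rather than by a storage path (the catalog, schema, and table names here are hypothetical):

```sql
-- Create a Unity Catalog managed Delta table using a fully qualified
-- catalog.schema.table name; Delta Lake is the default format, so no
-- explicit USING DELTA clause is required.
CREATE TABLE main.sales.orders (
  order_id BIGINT,
  order_date DATE,
  amount DECIMAL(10, 2)
);

-- Query by name rather than by file path.
SELECT order_id, amount FROM main.sales.orders;
```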

The following table describes common Delta table types you might encounter in Databricks:

Table type: Description

Unity Catalog managed table: Always backed by Delta Lake. The default and recommended table type on Databricks. Provides many built-in optimizations.

Unity Catalog external table: Optionally backed by Delta Lake. Supports some legacy integration patterns with external Delta Lake clients.

Unity Catalog foreign table: Might be backed by Delta Lake, depending on the foreign catalog. Foreign tables backed by Delta Lake lack many of the optimizations present in Unity Catalog managed tables.

Streaming table: A Lakeflow Declarative Pipelines dataset backed by Delta Lake that includes an append or AUTO CDC ... INTO flow definition for incremental processing.

Feature table: Any Unity Catalog managed table or external table with a primary key declared. Used in ML workloads on Databricks.

Hive metastore table: Foreign tables in an internal or external federated Hive metastore, and tables in the legacy workspace Hive metastore. Both managed and external Hive metastore tables can optionally be backed by Delta Lake.

Materialized view: A Lakeflow Declarative Pipelines dataset backed by Delta Lake that materializes the results of a query using managed flow logic.
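As a hedged sketch of the two pipeline-backed table types above, the following Databricks SQL declares a streaming table that ingests new files incrementally and a materialized view that precomputes an aggregate over it. The catalog, schema, table names, and source path are hypothetical:

```sql
-- A streaming table includes a flow that processes new source rows
-- incrementally on each refresh.
CREATE OR REFRESH STREAMING TABLE main.sales.orders_bronze
AS SELECT * FROM STREAM read_files('/Volumes/main/sales/raw_orders');

-- A materialized view stores the results of a query and is kept up to
-- date by managed flow logic.
CREATE OR REFRESH MATERIALIZED VIEW main.sales.daily_revenue
AS SELECT order_date, sum(amount) AS revenue
   FROM main.sales.orders_bronze
   GROUP BY order_date;
```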

Legacy table types

The following legacy table types are supported for backward compatibility but are not recommended for new development.

Hive tables

The term Hive tables describes tables implemented using legacy patterns, including the legacy Hive metastore, Hive SerDe codecs, or Hive SQL syntax.

Tables registered using the legacy Hive metastore store data in the legacy DBFS root by default. Databricks recommends migrating all tables from the legacy Hive metastore to Unity Catalog. See Database objects in the legacy Hive metastore.

You can optionally federate a Hive metastore to Unity Catalog. See Hive metastore federation: enable Unity Catalog to govern tables registered in a Hive metastore.

Apache Spark supports registering and querying Hive tables, but Hive codecs are not optimized for Databricks. Databricks recommends registering Hive tables only to support queries against data written by external systems. See Hive table (legacy).
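In workspaces where Unity Catalog is enabled, tables in the legacy workspace Hive metastore remain reachable through the hive_metastore catalog, which can ease incremental migration. A minimal sketch, with a hypothetical schema and table name:

```sql
-- Legacy Hive metastore tables are addressed through the
-- hive_metastore catalog using three-level names.
SELECT * FROM hive_metastore.default.legacy_events;
```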

Live tables

The term live tables refers to an earlier implementation of functionality now implemented as materialized views. Any legacy code that references live tables should be updated to use syntax for materialized views. See Lakeflow Declarative Pipelines and Materialized views.
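A hedged before-and-after sketch of this migration, with hypothetical table and source names:

```sql
-- Legacy live table syntax (shown commented out):
-- CREATE OR REFRESH LIVE TABLE daily_counts
-- AS SELECT order_date, count(*) AS n
--    FROM LIVE.orders GROUP BY order_date;

-- Updated materialized view syntax:
CREATE OR REFRESH MATERIALIZED VIEW daily_counts
AS SELECT order_date, count(*) AS n
   FROM orders GROUP BY order_date;
```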