
What is a table?

A table resides in a schema and contains rows of data. The default table type created in Databricks is a Unity Catalog managed table.

The primary differentiator for table types in Databricks is the owning catalog, as described in the following table:

| Table type | Managing catalog |
| --- | --- |
| Managed | Unity Catalog |
| External | None |
| Foreign | An external system or catalog service |

The following example shows a table named prod.people_ops_employees that contains data about five employees. The metadata is registered in Unity Catalog and the data is stored in cloud storage.

[Figure: Example table containing employee data]

Managed tables

Managed tables manage underlying data files alongside the metastore registration. Databricks recommends that you use managed tables whenever you create a new table. Unity Catalog managed tables are the default when you create tables in Databricks. They always use Delta Lake. See Work with managed tables.
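
As a minimal sketch (the catalog, schema, and column names here are hypothetical), creating a managed table needs no format or location clause, because Unity Catalog manages storage and Delta Lake is the default:

```sql
-- Managed table: no USING or LOCATION clause is needed.
-- Unity Catalog manages the underlying Delta Lake files.
CREATE TABLE main.people_ops.employees (
  id   INT,
  name STRING,
  role STRING
);
```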

External tables

External tables, sometimes called unmanaged tables, reference data stored outside of Databricks in an external storage system, such as cloud object storage. They decouple the management of underlying data files from metastore registration. Unity Catalog supports external tables in several formats, including Delta Lake. Unity Catalog external tables can store data files using common formats readable by external systems. See Work with external tables.
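
A minimal sketch of the difference (the bucket path and names are hypothetical): an external table declares an explicit LOCATION, and dropping the table does not delete the underlying files:

```sql
-- External table: the LOCATION clause points at files in cloud storage
-- that you manage; dropping the table leaves the files in place.
CREATE TABLE main.people_ops.employees_ext (
  id   INT,
  name STRING
)
USING DELTA
LOCATION 's3://example-bucket/people-ops/employees';
```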

Foreign tables

Foreign tables represent data stored in external systems connected to Databricks through Lakehouse Federation or Hive metastore federation. Foreign tables are read-only on Databricks. See Work with foreign tables.

Tables in Unity Catalog

In Unity Catalog, tables sit at the third level of the three-level namespace (catalog.schema.table), as shown in the following diagram.

[Diagram: Unity Catalog object model, focused on the table level]
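
For illustration (the names are hypothetical), a query can address a table through its fully qualified three-level name, or set the catalog and schema context first:

```sql
-- Fully qualified three-level name: catalog.schema.table.
SELECT * FROM main.people_ops.employees;

-- Equivalent: set the context, then use the bare table name.
USE CATALOG main;
USE SCHEMA people_ops;
SELECT * FROM employees;
```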

Basic table permissions

Most table operations require USE CATALOG and USE SCHEMA permissions on the catalog and schema containing a table.

The following table summarizes the additional permissions needed for common table operations in Unity Catalog:

| Operation | Permissions |
| --- | --- |
| Create a table | CREATE TABLE on the containing schema |
| Query a table | SELECT on the table |
| Update, delete, merge, or insert data into a table | SELECT and MODIFY on the table |
| Drop a table | MANAGE on the table |
| Replace a table | MANAGE on the table and CREATE TABLE on the containing schema |
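
As a sketch of how these privileges are granted (the group and object names are hypothetical):

```sql
-- Prerequisites for most operations on tables in main.people_ops.
GRANT USE CATALOG ON CATALOG main TO `analysts`;
GRANT USE SCHEMA ON SCHEMA main.people_ops TO `analysts`;

-- Allow the group to create new tables, and to query and modify one table.
GRANT CREATE TABLE ON SCHEMA main.people_ops TO `analysts`;
GRANT SELECT, MODIFY ON TABLE main.people_ops.employees TO `analysts`;
```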

For more on Unity Catalog permissions, see Manage privileges in Unity Catalog.

What is a Delta table?

Tables backed by Delta Lake are also called Delta tables.

A Delta table stores data as a directory of files in cloud object storage and registers table metadata to the metastore within a catalog and schema. See What is Delta Lake?.

Delta Lake is the default format whenever you save data or create a table in Databricks. Because Delta tables are the default on Databricks, most references to tables describe the behavior of Delta tables unless otherwise noted.

Databricks recommends that you always interact with Delta tables using fully qualified table names rather than file paths.
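
For example (the table name and path are hypothetical), prefer the first form over the second:

```sql
-- Preferred: query by fully qualified table name, governed by Unity Catalog.
SELECT * FROM main.people_ops.employees;

-- Discouraged: path-based access bypasses the table's registered metadata.
SELECT * FROM delta.`s3://example-bucket/people-ops/employees`;
```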

You can create tables on Databricks that don't use Delta Lake. These tables don't provide the transactional guarantees or optimized performance of Delta tables.
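
A sketch of a non-Delta table (the format, path, and names are chosen for illustration); the USING clause selects the format explicitly:

```sql
-- Non-Delta external table: USING selects a format other than Delta Lake.
-- Such tables lack Delta's transactional guarantees and optimizations.
CREATE TABLE main.people_ops.employees_csv (
  id   INT,
  name STRING
)
USING CSV
OPTIONS (header 'true')
LOCATION 's3://example-bucket/people-ops/employees-csv';
```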

The following table describes common Delta tables you might encounter in Databricks:

| Table type | Description |
| --- | --- |
| Unity Catalog managed table | Always backed by Delta Lake. The default and recommended table type on Databricks. Provides many built-in optimizations. |
| Unity Catalog external table | Can optionally be backed by Delta Lake. Supports some legacy integration patterns with external Delta Lake clients. |
| Streaming table | A DLT dataset backed by Delta Lake that includes an append or APPLY CHANGES INTO flow definition for incremental processing. |
| Materialized view | A DLT dataset backed by Delta Lake that materializes the results of a query using managed flow logic. |
| Feature table | Any Unity Catalog managed table or external table with a primary key declared. Used in ML workloads on Databricks. |
| Unity Catalog foreign table | Might be backed by Delta Lake, depending on the foreign catalog. Foreign tables backed by Delta Lake lack many of the optimizations present in Unity Catalog managed tables. |
| Hive metastore table | Includes foreign tables in an internal or external federated Hive metastore and tables in the legacy workspace Hive metastore. Both managed and external Hive metastore tables can optionally be backed by Delta Lake. |

Other table types

While managed, external, and foreign tables are the fundamental table types in Databricks, some products, features, and syntax make further distinctions. This section describes some of these other tables.

Streaming tables

Streaming tables are Delta tables used for processing incremental data in DLT. Most updates to streaming tables happen through refresh operations.

You can register streaming tables in Unity Catalog using Databricks SQL or define them as part of a DLT pipeline. See How streaming tables work, Load data using streaming tables in Databricks SQL, and What is DLT?.
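
As a minimal Databricks SQL sketch (the source path and names are hypothetical), a streaming table can incrementally ingest new files from cloud storage:

```sql
-- Streaming table: each refresh processes only newly arrived files.
CREATE OR REFRESH STREAMING TABLE main.people_ops.employee_events
AS SELECT *
FROM STREAM read_files('s3://example-bucket/employee-events/');
```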

Feature tables

Any Delta table managed by Unity Catalog that has a primary key is a feature table. See Work with feature tables in Unity Catalog.
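
For example (the names are hypothetical), declaring a primary key on a managed table makes it usable as a feature table; primary key columns must be NOT NULL:

```sql
-- The PRIMARY KEY constraint (informational, not enforced) marks this
-- managed table as a feature table for ML workloads.
CREATE TABLE main.ml.user_features (
  user_id       INT NOT NULL,
  num_purchases INT,
  last_active   DATE,
  CONSTRAINT user_features_pk PRIMARY KEY (user_id)
);
```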

Online tables

An online table is a read-only copy of a Delta table stored in a row-oriented format optimized for online access. See Use online tables for real-time feature serving.

Hive tables (legacy)

The term Hive tables describes tables implemented using legacy patterns, including the legacy Hive metastore, Hive SerDe codecs, or Hive SQL syntax.

Tables registered using the legacy Hive metastore store data in the legacy DBFS root by default. Databricks recommends migrating all tables from the legacy Hive metastore to Unity Catalog. See Database objects in the legacy Hive metastore.

You can optionally federate a Hive metastore to Unity Catalog. See Hive metastore federation: enable Unity Catalog to govern tables registered in a Hive metastore.

Apache Spark supports registering and querying Hive tables, but these codecs are not optimized for Databricks. Databricks recommends registering Hive tables only to support queries against data written by external systems. See Hive table (legacy).
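
For reference (the schema and table names are hypothetical), legacy Hive metastore tables remain addressable through the built-in hive_metastore catalog:

```sql
-- Legacy tables are reachable under the hive_metastore catalog,
-- alongside Unity Catalog's three-level namespace.
SELECT * FROM hive_metastore.default.legacy_events;
```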

Live tables (deprecated)

The term live tables refers to an earlier implementation of functionality now provided by materialized views. Any legacy code that references live tables should be updated to use the syntax for materialized views, as in the sketch below. See What is DLT? and Use materialized views in Databricks SQL.
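
A sketch of the migration (the table and column names are hypothetical), replacing the deprecated live table syntax with materialized view syntax in a pipeline's SQL:

```sql
-- Before (deprecated DLT syntax):
-- CREATE LIVE TABLE sales_summary AS
--   SELECT region, SUM(amount) AS total_sales FROM orders GROUP BY region;

-- After (current syntax):
CREATE MATERIALIZED VIEW sales_summary AS
  SELECT region, SUM(amount) AS total_sales
  FROM orders
  GROUP BY region;
```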