Work with foreign tables
Foreign tables, sometimes referred to as federated tables, are tables registered using Unity Catalog as part of a foreign catalog. Foreign tables contain data and metadata managed by external systems, with Unity Catalog adding data governance to query these tables.
Databricks supports the following methods for registering foreign tables:
- Lakehouse Federation uses secure JDBC connections to federate to external data systems such as PostgreSQL and MySQL. See What is Lakehouse Federation?
- Hive metastore federation adds Unity Catalog data governance to tables managed by a Hive metastore. See Hive metastore federation: enable Unity Catalog to govern tables registered in a Hive metastore.
All tables in a foreign catalog are foreign tables, and foreign tables must reside in a foreign catalog.
For backwards compatibility with legacy Apache Spark and Databricks workloads, foreign tables in a federated Hive metastore return metadata from the Hive metastore, including whether the table is a Hive managed table or a Hive external table.
Why use a foreign table?
Foreign tables provide flexibility when integrating Databricks with existing data systems or migrating from legacy systems.
Many foreign tables serve as a temporary solution for direct access to data not managed by Databricks, because they require no data migration and no code refactoring for upstream ETL workflows. Databricks recommends migrating datasets that drive production workloads or are queried frequently to Unity Catalog managed tables, as managed tables provide the best performance and have many built-in optimizations.
Lakehouse Federation provides a complementary solution for loading data from external data systems not supported by LakeFlow Connect. Databricks recommends using materialized views to replicate foreign tables to Unity Catalog. See Load data from foreign tables with materialized views.
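As a rough illustration of the materialized view recommendation, the following Python sketch builds a `CREATE MATERIALIZED VIEW` statement that selects from a foreign table. The helper names (`quote_ident`, `qualified`, `materialized_view_sql`) and the catalog, schema, and table names are hypothetical and chosen only for this example. The sketch only constructs the SQL text and does not run it.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote one identifier part, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def qualified(catalog: str, schema: str, table: str) -> str:
    """Build a three-level name of the form catalog.schema.table."""
    return ".".join(quote_ident(part) for part in (catalog, schema, table))


def materialized_view_sql(source: tuple, target: tuple) -> str:
    """Return a CREATE MATERIALIZED VIEW statement that reads a foreign table.

    `source` and `target` are (catalog, schema, table) tuples.
    """
    return (
        f"CREATE MATERIALIZED VIEW {qualified(*target)} "
        f"AS SELECT * FROM {qualified(*source)}"
    )


# Hypothetical names: a foreign catalog and a Unity Catalog target schema.
stmt = materialized_view_sql(
    source=("pg_catalog_example", "public", "orders"),
    target=("main", "replicas", "orders_mv"),
)
print(stmt)
```

The generated statement could then be submitted through whichever SQL interface your workspace uses. `SELECT *` keeps the sketch short; a production view would likely list columns explicitly.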
Create or write to foreign tables
If you have sufficient privileges and your workspace has been configured with an internal federated Hive metastore, you can create or write to foreign tables backed by an internal federated Hive metastore. External federated Hive metastores and all foreign tables backed by Lakehouse Federation are read-only.
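The write rules above can be summarized as a small decision function. This is only a sketch of the rule as stated in this section, not a Databricks API: the `Backing` enum and `is_writable` function are hypothetical names introduced for illustration, and `has_privileges` stands in for whatever privilege check applies in your workspace.

```python
from enum import Enum


class Backing(Enum):
    """How a foreign table is registered (hypothetical enum for this sketch)."""
    INTERNAL_FEDERATED_HMS = "internal federated Hive metastore"
    EXTERNAL_FEDERATED_HMS = "external federated Hive metastore"
    LAKEHOUSE_FEDERATION = "Lakehouse Federation"


def is_writable(backing: Backing, has_privileges: bool) -> bool:
    """Return True only for the one case this section describes as writable.

    Tables behind an external federated Hive metastore or Lakehouse
    Federation are read-only regardless of privileges.
    """
    return backing is Backing.INTERNAL_FEDERATED_HMS and has_privileges


for backing in Backing:
    print(backing.value, "->", is_writable(backing, has_privileges=True))
```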
Databricks does not manage the metadata, data, or semantics for writes to foreign tables. Foreign tables might be backed by an ACID-compliant format such as Delta Lake, but foreign tables do not provide the transactional guarantees of Unity Catalog managed tables.
Most Databricks optimizations for query performance, enhanced write speed, data skipping, and metadata-only queries require Delta Lake and Unity Catalog. Databricks recommends comparing read and write query performance between foreign tables and Unity Catalog managed tables using the latest Databricks Runtime version to evaluate latency and cost differences.
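One way to approach the recommended comparison is a small timing harness that runs the same query against both tables several times and compares median latency. The sketch below is a minimal example under stated assumptions: `median_seconds` and `compare` are hypothetical helpers, and the two stand-in callables simulate queries with `time.sleep` so the block runs anywhere. In a Databricks notebook you might instead pass callables that run your actual read or write against each table.

```python
import statistics
import time
from typing import Callable, Dict


def median_seconds(run_query: Callable[[], object], repeats: int = 5) -> float:
    """Run `run_query` `repeats` times and return the median wall-clock time."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def compare(foreign: Callable[[], object],
            managed: Callable[[], object],
            repeats: int = 5) -> Dict[str, float]:
    """Time both callables and report medians plus the foreign/managed ratio."""
    foreign_s = median_seconds(foreign, repeats)
    managed_s = median_seconds(managed, repeats)
    ratio = foreign_s / managed_s if managed_s > 0 else float("inf")
    return {"foreign_s": foreign_s, "managed_s": managed_s, "ratio": ratio}


# Stand-ins for real queries (assumption: replace with calls that query
# the foreign table and the managed table, respectively).
def fake_foreign_query() -> None:
    time.sleep(0.02)


def fake_managed_query() -> None:
    time.sleep(0.001)


result = compare(fake_foreign_query, fake_managed_query, repeats=3)
print(result)
```

The median is used rather than the mean so that a single slow run, such as a cold cache on the first execution, does not dominate the result. Wall-clock latency is only one part of the comparison; cost differences need to be evaluated separately.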