What is Lakehouse Federation?

Lakehouse Federation is the query federation platform for Databricks. The term query federation describes a collection of features that enable users and systems to run queries against multiple data sources without needing to migrate all data to a unified system.

There are two types of federation: query federation and catalog federation. This page covers the differences between the types.

Query federation compared to catalog federation

The following table summarizes the key differences between query federation and catalog federation:

Attribute	Query federation	Catalog federation
Query path	Unity Catalog queries are pushed down to the foreign database over JDBC. The query runs on both Databricks compute and the remote system's compute.	Unity Catalog queries access the foreign table directly in object storage. Catalog federation is available for platforms that support direct access to their catalog and storage services. The query runs only on Databricks compute, which makes catalog federation more cost-effective and performance-optimized than query federation.
Use case	You need ad hoc reporting or proof-of-concept access to operational data stored in external databases. You want to minimize data movement and maintain live access to external systems. When your source supports both Lakehouse Federation and Lakeflow Connect, Databricks recommends Lakeflow Connect if performance on higher data volumes and lower latency are priorities.	You’re migrating to Unity Catalog but need to incrementally phase in data managed by a foreign catalog. You want a long-term hybrid model in which some data stays in an external catalog and some data is managed by Unity Catalog.
Overview of steps	1. Create a connection in Unity Catalog with your access credentials and JDBC URL. 2. Create a foreign catalog using the connection. 3. Grant privileges to users on tables in the foreign catalog. 4. Run queries. These are pushed down to the external database.	1. Create a connection in Unity Catalog for accessing the external catalog. 2. Create a storage credential and an external location for the table paths. 3. Create a foreign catalog using the connection and the external location. 4. Grant privileges to users on tables in the foreign catalog. 5. Run queries. These run directly against the object storage.
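
The query federation steps in the table can be sketched as a short sequence of SQL statements. The Python helper below only builds those statements as strings so the sequence is easy to read. On Databricks, each statement would typically be passed to `spark.sql(...)` or run in a SQL editor. This is a minimal sketch under stated assumptions, not a definitive setup: the PostgreSQL connection type, the `host`/`port`/`user`/`password` option names, the `secret(...)` lookups, and every name (`pg_conn`, `pg_sales`, `fed_scope`, `analysts`, `public.orders`) are illustrative placeholders. Check the documentation for your specific connector for the exact options it expects.

```python
def sql_literal(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def query_federation_statements(
    connection: str = "pg_conn",      # hypothetical connection name
    catalog: str = "pg_sales",        # hypothetical foreign catalog name
    host: str = "db.example.com",     # hypothetical database host
    port: str = "5432",
    database: str = "sales",          # hypothetical remote database
    table: str = "public.orders",     # hypothetical schema.table in the source
    principal: str = "analysts",      # hypothetical group that receives access
) -> list[str]:
    """Return the SQL statements for the four query federation steps, in order."""
    schema = table.split(".")[0]
    return [
        # Step 1: create a connection (credentials read from a secret scope,
        # which is assumed to exist already).
        f"CREATE CONNECTION {connection} TYPE postgresql OPTIONS ("
        f"host {sql_literal(host)}, port {sql_literal(port)}, "
        "user secret('fed_scope', 'pg_user'), "
        "password secret('fed_scope', 'pg_password'))",
        # Step 2: create a foreign catalog that mirrors the remote database.
        f"CREATE FOREIGN CATALOG IF NOT EXISTS {catalog} "
        f"USING CONNECTION {connection} OPTIONS (database {sql_literal(database)})",
        # Step 3: grant the privileges needed to read one table.
        f"GRANT USE CATALOG ON CATALOG {catalog} TO `{principal}`",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO `{principal}`",
        f"GRANT SELECT ON TABLE {catalog}.{table} TO `{principal}`",
        # Step 4: query the foreign table; the query is pushed down to the source.
        f"SELECT * FROM {catalog}.{table} LIMIT 10",
    ]


if __name__ == "__main__":
    for statement in query_federation_statements():
        print(statement + ";")
```

Catalog federation follows the same shape, with an additional storage credential and external location created before the foreign catalog.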

Supported data sources

Connect to the following sources using query federation:

Connect to the following sources using catalog federation:

AWS Glue metastore
