Skip to main content

Convert a foreign table to a managed Unity Catalog table

Preview

This feature is in Public Preview and is only available to participating customers at this time. To participate in the preview, apply by filling out this form. This feature only supports converting foreign tables federated using HMS and Glue Federation.

This page describes how to use SET MANAGED to convert a foreign table to a managed table.

SET MANAGED overview

Use the SET MANAGED feature to convert a foreign table to a Unity Catalog managed table in Databricks. SET MANAGED offers the following benefits:

  • Retaining table history.
  • Keeping the same table configurations, including the same name, settings, permissions, and views.
  • Benefitting from predictive optimization. When converting a table to managed, predictive optimization is set to INHERIT, meaning it inherits the setting from the schema or catalog level. By default, this is currently disabled. You can enable predictive optimization on the converted table by running ALTER TABLE <table_name> ENABLE PREDICTIVE OPTIMIZATION, which sets it to ENABLED instead of INHERIT.

Prerequisites

  • Data format: The foreign table's data format must be Delta Lake. To perform a one-time conversion for Parquet, see Convert to Delta Lake.
  • Table type: The HMS table type must be an external HMS table. The command fails if the table is a managed HMS table.
  • Runtime: Databricks Runtime 17.3 or above
  • Permissions: OWNER or MANAGE permissions on the table and CREATE permission on the EXTERNAL LOCATION

Syntax

To convert your Unity Catalog foreign table to be Unity Catalog managed, run the following command:

SQL
ALTER TABLE source_table SET MANAGED {MOVE | COPY}

Parameters

  • source_table

    An existing foreign table in Unity Catalog. Foreign tables contain data and metadata managed by an external catalog. Before conversion, if you drop the source table in the external catalog, the foreign table is also dropped in Unity Catalog. After the table is converted to managed, dropping the source table in the external catalog does not affect the Unity Catalog managed table.

  • MOVE

    Converts the table to managed and disables access to the source table in the external catalog.

    • Access through the external catalog or path-based access fails after the table is converted. All readers and writers to the table must use the Unity Catalog namespace for access.

    • All readers and writers must use name-based access. For example:

      SQL
      SELECT * FROM catalog_name.schema_name.table_name;
    • Path-based access is not supported and fails after the table is converted. For example:

      SQL
      SELECT * FROM delta.`protocol://path/to/table`;
    • External (non-Databricks) clients must support reads to Unity Catalog managed tables. See Compatibility Mode.

    • Use the Access Insights dashboard to see whether readers and writers accessing your tables are Databricks Runtime or external non-Databricks clients.

    • Databricks readers and writers must use Databricks Runtime 15.4 LTS or above. If your readers or writers use Databricks Runtime 14.3 LTS or below, see Alternative option for readers and writers on Databricks Runtime 14.3 LTS or below.

    • Downtime can occur when readers or writers access the table during conversion. For more information, see Convert an external table to a managed Unity Catalog table.

    • If your command is interrupted, the table might be left as an external table since the conversion to managed was incomplete. To complete the command, run SET MANAGED again on the external table.

    • Predictive optimization is set to INHERIT unless you manually configured it. To check whether predictive optimization is enabled, see Check whether predictive optimization is enabled.

    • Rollback: To roll back the table migration and regain access to the source table in the external catalog, run the UNSET MANAGED command. The table becomes an external table after running the command. To return the table to a foreign table, drop the table and it will be re-federated as foreign in the next catalog sync.

      SQL
      ALTER TABLE catalog.schema.my_managed_table UNSET MANAGED
      warning

      You MUST run UNSET MANAGED before dropping the table. Dropping the table without running UNSET MANAGED first can leave your system in a bad state and may result in data loss or inconsistencies.

    • If you roll back, commits that you made to the external location between conversion and rollback are time-travelable by version but not by timestamp. Seven days after rollback, data in the managed location is deleted. For readers and writers in Databricks Runtime 15.4 LTS or above, table history for commits made after conversion but before rollback is time-travelable by version but not by timestamp.

    • After table conversion, you must:

      • Restart any streaming jobs (read or write) using the foreign table
      • Ensure that your readers and writers work with the managed table.
  • COPY

    Converts the table to managed without modifying or disabling access to the source table in the external catalog.

    • During conversion to managed, data from the source table is copied into the managed storage location defined for the foreign table, creating two separate copies: the new managed table and the source table in the external catalog.
    • Unlike MOVE where reads and writes fail, when using COPY, you are responsible for properly disabling reads and writes to the source table in the external catalog and ensuring that workloads have migrated to the new catalog.
    • Rollback: To roll back the table migration, you do not need to run the UNSET MANAGED command since the source table has not been disrupted in the external catalog. Drop the table and it will be re-federated as foreign in the next catalog sync.

Check conversion

You can confirm that your external table has been converted to a managed table:

SQL
DESCRIBE EXTENDED catalog_name.schema_name.table_name

If the table has been converted, the Type under col_name shows as MANAGED under data_type.

Known limitations

Converting foreign tables to managed tables has the following limitations:

  • Streaming clients: You must restart any streaming jobs after conversion.

  • Multiple cloud regions: If the default managed location of your Unity Catalog metastore, catalog, or schema is in a different cloud region from the storage location of the foreign table being converted, you can incur additional cross-region data transfer costs. The cloud provider imposes these charges outside of Databricks' control. To check the locations of your schema, catalog, and metastore, use the following commands:

    SQL
    -- Check schema location
    DESCRIBE SCHEMA EXTENDED catalog_name.schema_name;

    -- Check catalog location
    DESCRIBE CATALOG EXTENDED catalog_name;

    -- Check metastore location
    DESCRIBE METASTORE;

FAQ

Can I create tables as well as convert tables in a foreign catalog?

Yes, you can create external or managed tables in a foreign catalog. The behavior depends on the schema configuration:

  • For Glue or eHMS schemas, or for schemas with a managed location set in Unity Catalog: If you run CREATE TABLE foreign_catalog.schema.table, this creates a Unity Catalog managed or external table. The table is not pushed or synced to the external catalog.
  • For schemas from internal Hive metastore connections: If you try to create a table in a foreign schema, it still creates a foreign table and also creates a table in hive_metastore.
  • For legacy workspace Hive metastore: Since it is read and write federation, if you create a table in the foreign catalog, it also creates a table in the internal Hive metastore.

What if my foreign tables are DBFS-backed?

When converting a DBFS-backed table, we store the current mapping of the DBFS path to cloud path as the external table's cloud path location.

Can I convert at the schema or catalog level?

You can iterate through your tables in your schemas to convert individually, or utilize the discoverx labs project to convert entire schemas or catalogs at once:

Python
df = (dx.from_tables("prod.*.*")
.with_sql("ALTER TABLE {full_table_name} SET MANAGED;")
.apply())