SYNC
Applies to: Databricks SQL Databricks Runtime Unity Catalog only
Use the SYNC
command to upgrade external tables in Hive Metastore to external tables in Unity Catalog. You can also use SYNC
to upgrade Hive managed tables that are stored outside of Databricks workspace storage (sometimes called DBFS root) to external tables in Unity Catalog. You cannot use it to upgrade Hive managed tables stored in workspace storage. To upgrade those tables, use CREATE TABLE CLONE.
You can use SYNC
to create new tables in Unity Catalog from existing Hive Metastore tables as well as update the Unity Catalog tables when the source tables in Hive Metastore are modified.
The SYNC
command can be run at a schema level using the SYNC SCHEMA
syntax or for an individual table using the SYNC TABLE
syntax.
The command performs a write operation (ALTER TABLE
) to each source table it upgrades to add some additional table properties for its bookkeeping.
In case of Delta tables, to perform the write operation the cluster or SQL Warehouse that runs the command must have write access to the table location.
In Databricks Runtime 12.2 LTS or above, this behavior can be turned off by setting the Spark configuration spark.databricks.sync.command.disableSourceTableWrites
to true
before running the SYNC
command. When set to true
, SYNC
does not add new table properties and therefore might not detect if the table has previously been upgraded to Unity Catalog.
In that case, it exclusively relies on the table name to determine if the table has been previously upgraded to Unity Catalog.
If the source table has been renamed since the last SYNC command, the user needs to manually rename the destination table before re-running the SYNC command when the config is true
.
Important
When a SYNC
command is executed, the SET TBLPROPERTIES
operation adds a table property that indicates the target Unity Catalog external table reference. This operation computes a new Delta snapshot and adds a new entry to the table Delta log, writing to the target table path in cloud storage.
Syntax
SYNC { SCHEMA target_schema [AS EXTERNAL] FROM source_schema |
TABLE target_table [AS EXTERNAL] FROM source_table }
[SET OWNER principal]
[DRY RUN]
Parameters
SCHEMA
SYNC
all the tables within a schema.-
An existing schema in Unity Catalog within which the user is authorized to create tables.
-
An existing schema in the
hive_metastore
catalog, which is owned by the user.
-
TABLE
SYNC
an individual table.-
A new or existing table in Unity Catalog in a schema within which the user is authorized to create tables. If the table already exists it is replaced to match
source_table
, and the user must also own the table. If the table does not exist it will be created. -
An existing table in
hive_metastore
which the user owns.
-
-
Optionally set the owner of the upgraded tables in Unity Catalog to
principal
. The default owner is the current user. AS EXTERNAL
SYNC
a Hive managed table or schema that is stored outside of Databricks workspace storage (sometimes called DBFS root) to external tables in Unity Catalog. You cannot useAS EXTERNAL
to upgrade Hive managed tables stored in workspace storage.DRY RUN
When specified checks whether the
source_table
or tables withinsource_schema
can be upgraded without actually creating or upgrading the target tables. The command returnsDRY_RUN_SUCCESS
if a table can be upgraded.AS EXTERNAL
Starting Databricks Runtime 13.2 and above, this optional clause can be added to specify that managed tables in hive metastore are upgraded as external tables in Unity Catalog. When used withSYNC SCHEMA
, it applies to all the tables, including managed tables in thesource_schema.
Returns
A report with the following columns:
source_schema STRING
The name of the source schema. The schema is
NULL
if the source is an unsupported temporary view.source_name STRING NOT NULL
The name of the source table.
source_type STRING NOT NULL
The type of the table:
MANAGED
orEXTERNAL
target_catalog STRING NOT NULL
The target catalog in Unity Catalog where the table is synced.
target_schema STRING NOT NULL
The target schema in Unity Catalog where the table is synced.
target_name STRING NOT NULL
The name of the table in Unity Catalog to which the source table is synced. This name matches the source table name.
status_code STRING NOT NULL
A status code for the result of the
SYNC
command for the source table.description STRING
A descriptive message about the status of the sync command for the source table.
Common status codes returned by SYNC
The SYNC
command provides a unique status_code
field in the output for each table to be upgraded to the Unity Catalog representing the status of the upgrade.
Some common status codes along with the recommendations to address them are:
DRY_RUN_SUCCESS
: Dry run successful.The table can be upgraded to Unity Catalog using the
SYNC
command.DBFS_ROOT_LOCATION
: Table located in the Databricks Filesystem root.The table is located in the Databricks Filesystem root location. This is not supported in Unity Catalog. Copy the table data to the Unity Catalog location using a CREATE TABLE command with the
DEEP CLONE
option.EXTERNAL_TABLE_IN_MANAGED_LOCATION
: External table path cannot be under managed storage.The path given for the external table is within Unity Catalog manage storage. If the table needs to be under the managed storage, upgrade the table as a managed table using a CREATE TABLE command with the
DEEP CLONE
option or move the table location out of Unity Catalog managed storage.HIVE_SERDE
: The table is not eligible for an upgrade from Hive Metastore to Unity Catalog. Reason: Hive SerDe Table.Hive SerDe tables are not supported by Unity Catalog. Change the tables into Delta format and issue the
SYNC
command to upgrade.INVALID_DATASOURCE_FORMAT
: Datasource format not specified or is not supported.Use one of the supported data source formats: Delta, Parquet, CSV, JSON, ORC, TEXT
LOCATION_OVERLAP
: Input path overlaps with other external tables.The table location overlaps with other external tables. Use a different location for the table or remove the overlapping external tables.
MULTIPLE_EXT_LOCATIONS
: Input path contains other external locations.There are more than one external locations which are subdirectories of the provided table path. Check if the external locations within the table location are necessary.
MULTIPLE_TARGET_TABLE
: A different synced table already exists. Only one target table per source table is allowed.The source table was already synced to a different target table previously which is not allowed. To force the
SYNC
to a different table, remove the table propertyupgraded_to
from the source table or remove the previously synced table from Unity Catalog if it is not needed anymore.NOT_EXTERNAL
: Table is not eligible for upgrade from Hive Metastore to Unity Catalog. Reason: Not an external table.SYNC
command only supports migrating external tables to Unity Catalog. For managed tables, create a managed table in Unity Catalog using a CREATE TABLE command with theDEEP CLONE
option. Alternatively, use theAS EXTERNAL
clause with theSYNC
command to create an external table in Unity Catalog.READ_ONLY_CATALOG
: Data inside a Delta sharing catalog is read-only and cannot be modified or deleted.The chosen catalog is a delta sharing catalog which is read-only. Tables within a read-only catalog cannot be updated using the
SYNC
command.SUCCESS
: Table successfully synced.TABLE_ALREADY_EXISTS
: Target table already exists.A table with the same name as the chosen table already exists in Unity Catalog. Rename or remove the existing table in Unity Catalog and rerun the
SYNC
command.TEMP_TABLE_NOT_SUPPORTED
: Temporary tables or views are not supported.Temporary tables or views cannot be upgraded to Unity Catalog. To use temporary tables or views, recreate them in Unity Catalog using the SHOW CREATE TABLE command in Unity Catalog.
TIMEOUT
: Sync task timed out.The sync command task took more than 300 seconds to complete. Increase
spark.databricks.sync.command.task.timeout
to a higher value in seconds. Default value is 300. If the error persists contact support.VIEWS_NOT_SUPPORTED
: Views are not supported.Recreate the views manually using SHOW CREATE TABLE command in Unity Catalog.
Examples
-- Sync an existing hive metastore table hive_metastore.default.my_tbl to a Unity Catalog
-- table named main.default.my_tbl.
> SYNC TABLE main.default.my_tbl FROM hive_metastore.default.my_tbl;
source_schema source_name source_type target_catalog target_schema target_name status_code description
------------- ----------- ----------- -------------- ------------- ----------- ----------- ---------------------------------
default my_tbl external main default my_tbl SUCCESS Table main.default.my_tbl synced.
-- Sync an existing managed hive metastore table hive_metastore.default.my_tbl to an external table named main.default.my_tbl in Unity Catalog.
> SYNC TABLE main.default.my_tbl AS EXTERNAL FROM hive_metastore.default.my_tbl;
source_schema source_name source_type target_catalog target_schema target_name status_code description
------------- ----------- ----------- -------------- ------------- ----------- ----------- ---------------------------------
default my_tbl managed main default my_tbl SUCCESS Table main.default.my_tbl synced.
-- SYNC a table in DRY RUN mode to evaluate the upgradability of the hive metastore table.
> SYNC TABLE main.default.my_tbl FROM hive_metastore.default.my_tbl DRY RUN;
source_schema source_name source_type target_catalog target_schema target_name status_code description
------------- ----------- ----------- -------------- ------------- ----------- --------------- ---------------------------------
default my_tbl external main default my_tbl DRY_RUN_SUCCESS
-- SYNC all the eligible tables in schema hive_metastore.mydb to a Unity Catalog schema main.my_db_uc.
-- The upgraded tables in main.my_db_uc will be owned by alf@melmak.et
> SYNC SCHEMA main.my_db_uc FROM hive_metastore.my_db SET OWNER `alf@melmak.et`;
source_schema source_name source_type target_catalog target_schema target_name status_code description
------------- ----------- ----------- -------------- ------------- ----------- ----------- ---------------------------------
...
-- DRY RUN mode of SYNC SCHEMA to evaluate all the tables in a schema
-- hive_metastore.mydb for upgrading to Unity Catalog.
> SYNC SCHEMA main.my_db_uc FROM hive_metastore.my_db DRY RUN;
source_schema source_name source_type target_catalog target_schema target_name status_code description
------------- ----------- ----------- -------------- ------------- ----------- ----------- ---------------------------------
...
-- Sync all tables including managed tables in a schema hive_metastore.mydb
-- as external tables in Unity Catalog.
> SYNC SCHEMA main.my_db_uc AS EXTERNAL FROM hive_metastore.my_db;
source_schema source_name source_type target_catalog target_schema target_name status_code description
------------- ----------- ----------- -------------- ------------- ----------- ----------- ---------------------------------
...