SYNC

Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime

The SYNC command is used to upgrade external tables in Hive Metastore to external tables in Unity Catalog. You can use it to create new tables in Unity Catalog from existing Hive Metastore tables as well as update the Unity Catalog tables when the source tables in Hive Metastore are changed.

The SYNC command can be run at a schema level using the SYNC SCHEMA syntax or for an individual table using the SYNC TABLE syntax.

Preview

This feature is in Public Preview.

Syntax

SYNC { SCHEMA target_schema FROM source_schema |
       TABLE target_table FROM source_table }
  [SET OWNER principal]
  [DRY RUN]

Parameters

  • SCHEMA

    SYNC all the tables within a schema.

    • target_schema

      An existing schema in Unity Catalog within which the user is authorized to create tables.

    • source_schema

      An existing schema in the hive_metastore catalog, which is owned by the user.

  • TABLE

    SYNC an individual table.

    • target_table

      A new or existing table in Unity Catalog in a schema within which the user is authorized to create tables. If the table already exists it is replaced to match source_table, and the user must also own the table. If the table does not exist it will be created.

    • source_table

      An existing table in hive_metastore which the user owns.

  • principal

    Optionally set the owner of the upgraded tables in Unity Catalog to principal. The default owner is the current user.

  • DRY_RUN

    When specified checks whether the source_table or tables within source_schema can be upgraded without actually creating or upgrading the target tables. The command returns DRY_RUN_SUCCESS if a table can be upgraded.

Returns

A report with the following columns:

  • source_schema STRING

    The name of the source schema. The schema is NULL if the source is an unsupported temporary view.

  • source_name STRING NOT NULL

    The name of the source table.

  • source_type STRING NOT NULL

    The type of the table: MANAGED or EXTERNAL

  • target_catalog STRING NOT NULL

    The target catalog in Unity Catalog where the table is synced.

  • target_schema STRING NOT NULL

    The target schema in Unity Catalog where the table is synced.

  • target_name STRING NOT NULL

    The name of the table in Unity Catalog to which the source table is synced. This name matches the source table name.

  • status_code STRING NOT NULL

    A status code for the result of the SYNC command for the source table.

  • description STRING

    A descriptive message about the status of the sync command for the source table.

Common status codes returned by SYNC

The SYNC command provides a unique status_code field in the output for each table to be upgraded to the Unity Catalog representing the status of the upgrade. Some common status codes along with the recommendations to address them are:

  • DRY_RUN_SUCCESS: Dry run successful.

    The table can be upgraded to Unity Catalog using the SYNC command.

  • DBFS_ROOT_LOCATION: Table located in the Databricks Filesystem root.

    The table is located in the Databricks Filesystem root location. This is not supported in Unity Catalog. Copy the table data to the Unity Catalog location using a CREATE TABLE command with the DEEP CLONE option.

  • EXTERNAL_TABLE_IN_MANAGED_LOCATION: External table path cannot be under managed storage.

    The path given for the external table is within Unity Catalog manage storage. If the table needs to be under the managed storage, upgrade the table as a managed table using a CREATE TABLE command with the DEEP CLONE option or move the table location out of Unity Catalog managed storage.

  • HIVE_SERDE: The table is not eligible for an upgrade from Hive Metastore to Unity Catalog. Reason: Hive SerDe Table.

    Hive SerDe tables are not supported by Unity Catalog. Change the tables into Delta format and issue the SYNC command to upgrade.

  • INVALID_DATASOURCE_FORMAT: Datasource format not specified or is not supported.

    Use one of the supported data source formats: Delta, Parquet, CSV, JSON, ORC, TEXT

  • LOCATION_OVERLAP: Input path overlaps with other external tables.

    The table location overlaps with other external tables. Use a different location for the table or remove the overlapping external tables.

  • MULTIPLE_EXT_LOCATIONS: Input path contains other external locations.

    There are more than one external locations which are subdirectories of the provided table path. Check if the external locations within the table location are necessary.

  • MULTIPLE_TARGET_TABLE: A different synced table already exists. Only one target table per source table is allowed.

    The source table was already synced to a different target table previously which is not allowed. To force the SYNC to a different table, remove the table property upgraded_to from the source table or remove the previously synced table from Unity Catalog if it is not needed anymore.

  • NOT_EXTERNAL: Table is not eligible for upgrade from Hive Metastore to Unity Catalog. Reason: Not an external table.

    SYNC command only supports migrating external tables to Unity Catalog. For managed tables, create a managed table in Unity Catalog using a CREATE TABLE command with the DEEP CLONE option.

  • READ_ONLY_CATALOG: Data inside a Delta sharing catalog is read-only and cannot be modified or deleted.

    The chosen catalog is a delta sharing catalog which is read-only. Tables within a read-only catalog cannot be updated using the SYNC command.

  • SUCCESS: Table successfully synced.

  • TABLE_ALREADY_EXISTS: Target table already exists.

    A table with the same name as the chosen table already exists in Unity Catalog. Rename or remove the existing table in Unity Catalog and rerun the SYNC command.

  • TEMP_TABLE_NOT_SUPPORTED: Temporary tables or views are not supported.

    Temporary tables or views cannot be upgraded to Unity Catalog. To use temporary tables or views, recreate them in Unity Catalog using the SHOW CREATE TABLE command in Unity Catalog.

  • TIMEOUT: Sync task timed out.

    The sync command task took more than 300 seconds to complete. Increase spark.databricks.sync.command.task.timeout to a higher value in seconds. Default value is 300. If the error persists contact support.

  • VIEWS_NOT_SUPPORTED: Views are not supported.

    Recreate the views manually using SHOW CREATE TABLE command in Unity Catalog.

Examples

- Sync an existing hive metastore table hive_metastore.default.my_tbl - to a Unity Catalog table named main.default.my_tbl
> SYNC TABLE main.default.my_tbl FROM hive_metastore.default.my_tbl;
  source_schema source_name source_type target_catalog target_schema target_name status_code description
  ------------- ----------- ----------- -------------- ------------- ----------- ----------- ---------------------------------
  default       my_tbl      external    main           default       my_tbl      SUCCESS     Table main.default.my_tbl synced.

- SYNC a table in DRY RUN mode to evaluate the upgradability of the hive metastore table.
> SYNC TABLE main.default.my_tbl FROM hive_metastore.default.my_tbl DRY RUN;
  source_schema source_name source_type target_catalog target_schema target_name status_code     description
  ------------- ----------- ----------- -------------- ------------- ----------- --------------- ---------------------------------
  default       my_tbl      external    main           default       my_tbl      DRY_RUN_SUCCESS

- SYNC all the eligible tables in schema hive_metastore.mydb to a Unity Catalog schema main.my_db_uc.
- The upgraded tables in main.my_db_uc will be owned by alf@melmak.et
> SYNC SCHEMA main.my_db_uc FROM hive_metastore.my_db SET OWNER `alf@melmak.et`;
  source_schema source_name source_type target_catalog target_schema target_name status_code description
  ------------- ----------- ----------- -------------- ------------- ----------- ----------- ---------------------------------
  ...

- DRY RUN mode of SYNC SCHEMA to evaluate all the tables in a schema
- hive_metastore.mydb for upgrading to Unity Catalog.
> SYNC SCHEMA main.my_db_uc FROM hive_metastore.my_db DRY RUN;
  source_schema source_name source_type target_catalog target_schema target_name status_code     description
  ------------- ----------- ----------- -------------- ------------- ----------- -----------     ---------------------------------
  ...