REPAIR TABLE

Applies to: Databricks SQL Databricks Runtime

This command repairs or modifies partitions for tables.

Use the SYNC METADATA clause with Delta Lake to update the catalog service based on table metadata, or to generate Apache Iceberg metadata for tables enabled for Iceberg reads. Do not use SYNC METADATA when you are modifying partitions for non-Delta Lake tables.

Supported functionality depends on whether tables are registered in Unity Catalog or Hive metastore and whether tables are backed by Delta Lake.

Syntax

[ MSCK ] REPAIR TABLE table_name
{
    [ {ADD | DROP | SYNC} PARTITIONS] | SYNC METADATA
}

For Databricks SQL Databricks Runtime 12.2 LTS and above, MSCK is optional.

Parameters

table_name

The name of the table. If the table cannot be found Databricks raises a TABLE_OR_VIEW_NOT_FOUND error.
ADD or DROP or SYNC PARTITIONS

Applies to: Databricks SQL Databricks Runtime 10.4 LTS and above

When creating a non-Delta table using the PARTITIONED BY clause, partitions are generated and registered in the Hive metastore. However, if you create the partitioned table from existing data, partitions are not registered automatically in the Hive metastore. Run MSCK REPAIR TABLE to register the partitions.

Another way to recover partitions is to use ALTER TABLE RECOVER PARTITIONS.

If the table is cached, the command clears the table's cached data and all dependents that refer to it. The cache fills the next time the table or dependents are accessed.
- ADD command adds new partitions to the session catalog for all sub-folders in the base table folder that don't belong to any table partitions. ADD is the default argument if no other option is specified.
- DROP command drops all partitions from the session catalog that have non-existing locations in the file system.
- SYNC is the combination of DROP and ADD.
SYNC METADATA

Delta Lake only.

If the configuration spark.databricks.delta.catalog.update.enabled is set to true, target table metadata updates are automatically synced to the catalog service. Otherwise, a manual sync using REPAIR TABLE table_name SYNC METADATA may be required.

Reads the transaction log of the target table and updates the metadata info in the catalog service. To run this command, you must have MODIFY and SELECT privileges on the target table and USE SCHEMA and USE CATALOG privileges on the parent schema and catalog.

This argument works with Hive metastore in Databricks Runtime 16.1 and above. For Hive metastore tables, you must have USAGE and MODIFY privileges.

If Delta UniForm is enabled (requires Unity Catalog), SYNC METADATA triggers manual conversion of current Delta metadata to Iceberg metadata and syncs the latest Iceberg version for the Unity Catalog Iceberg endpoint. See Read Delta Lake tables with Iceberg clients using UniForm.

You can use REPAIR TABLE table_name SYNC METADATA to apply the Unity Catalog permissions model to shallow clones that you are reading from a foreign catalog that was created using Hive metastore federation. See Working with shallow clones.

Examples (non-Delta Lake tables)

SQL
-- create a partitioned table from existing data /tmp/namesAndAges.parquet
> CREATE TABLE t1 (name STRING, age INT) USING parquet PARTITIONED BY (age)
  LOCATION "/tmp/namesAndAges.parquet";

-- SELECT * FROM t1 does not return results
> SELECT * FROM t1;

-- run MSCK REPAIR TABLE to recovers all the partitions
> MSCK REPAIR TABLE t1;

-- SELECT * FROM t1 returns results
> SELECT * FROM t1;
    name age
  ------- ---
  Michael  20
   Justin  19
     Andy  30

Example (Unity Catalog table)

SQL
-- run MSCK REPAIR TABLE t1 SYNC METADATA to update the metadata info to Unity Catalog service
> MSCK REPAIR TABLE t1 SYNC METADATA

Syntax​

Parameters​

Examples (non-Delta Lake tables)​

Example (Unity Catalog table)​

Related articles​

Syntax

Parameters

Examples (non-Delta Lake tables)

Example (Unity Catalog table)

Related articles