MSCK REPAIR TABLE

Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime

This command updates Delta table metadata to the Unity Catalog service. For non-Delta tables, it repairs the table’s partitions and updates the Hive metastore.

  • Delta tables: When executed with Delta tables using the SYNC METADATA argument, this command reads the delta log of the target table and updates the metadata info to the Unity Catalog service.

  • Non-Delta tables: When executed with non-Delta tables, this command recovers all the partitions in the directory of a non-Delta table and updates the Hive metastore.

Syntax

MSCK REPAIR TABLE table_name
{
    [ {ADD | DROP | SYNC} PARTITIONS] | SYNC METADATA
}

Parameters

  • table_name

    The name of the table. If the table cannot be found Databricks raises a TABLE_OR_VIEW_NOT_FOUND error.

  • ADD or DROP or SYNC PARTITIONS

    Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime 10.0 and above

    When creating a non-Delta table using the PARTITIONED BY clause, partitions are generated and registered in the Hive metastore. However, if you create the partitioned table from existing data, partitions are not registered automatically in the Hive metastore. Run MSCK REPAIR TABLE to register the partitions.

    Another way to recover partitions is to use ALTER TABLE RECOVER PARTITIONS.

    If the table is cached, the command clears the table’s cached data and all dependents that refer to it. The cache fills the next time the table or dependents are accessed.

    • ADD command adds new partitions to the session catalog for all sub-folders in the base table folder that don’t belong to any table partitions. ADD is the default argument if no other option is specified.

    • DROP command drops all partitions from the session catalog that have non-existing locations in the file system.

    • SYNC is the combination of DROP and ADD.

  • SYNC METADATA

    Reads the delta log of the target table and updates the metadata info in the Unity Catalog service. To run this command, you must have MODIFY and SELECT privileges on the target table and USAGE of the parent schema and catalog. This command with this argument will fail if the target table is not stored in Unity Catalog.

Examples (non-Delta Lake tables)

-- create a partitioned table from existing data /tmp/namesAndAges.parquet
> CREATE TABLE t1 (name STRING, age INT) USING parquet PARTITIONED BY (age)
  LOCATION "/tmp/namesAndAges.parquet";

-- SELECT * FROM t1 does not return results
> SELECT * FROM t1;

-- run MSCK REPAIR TABLE to recovers all the partitions
> MSCK REPAIR TABLE t1;

-- SELECT * FROM t1 returns results
> SELECT * FROM t1;
    name age
  ------- ---
  Michael  20
   Justin  19
     Andy  30

Example (Unity Catalog table)

-- run MSCK REPAIR TABLE t1 SYNC METADATA to update the metadata info to Unity Catalog service
> MSCK REPAIR TABLE t1 SYNC METADATA