Upgrade tables and views to Unity Catalog

Preview

Unity Catalog is in Public Preview. To participate in the preview, contact your Databricks representative.

To take advantage of Unity Catalog’s access control and auditing mechanisms, and to share data across multiple workspaces, you can upgrade tables and views to Unity Catalog.

Upgrade a table to Unity Catalog

To upgrade a table to Unity Catalog as a managed table:

  1. If necessary, create a metastore. See Create a Unity Catalog metastore.

  2. Assign that metastore to the workspace that contains the table.

  3. Create a new Unity Catalog table by querying the existing table. Replace the placeholder values:

    • <catalog>: The Unity Catalog catalog for the new table.

    • <new_schema>: The Unity Catalog schema for the new table.

    • <new_table>: A name for the Unity Catalog table.

    • <old_schema>: The schema for the old table, such as default.

    • <old_table>: The name of the old table.

    SQL

    CREATE TABLE <catalog>.<new_schema>.<new_table>
    AS SELECT * FROM hive_metastore.<old_schema>.<old_table>;

    Python

    df = spark.table("hive_metastore.<old_schema>.<old_table>")

    df.write.saveAsTable(
      name = "<catalog>.<new_schema>.<new_table>"
    )

    R

    library(SparkR)

    df <- tableToDF("hive_metastore.<old_schema>.<old_table>")

    saveAsTable(
      df = df,
      tableName = "<catalog>.<new_schema>.<new_table>"
    )

    Scala

    val df = spark.table("hive_metastore.<old_schema>.<old_table>")

    df.write.saveAsTable("<catalog>.<new_schema>.<new_table>")

    If you want to migrate only some columns or rows, modify the SELECT statement (see the sketch after these steps).

    Note

    This command creates a managed table, copying the data into the storage location that was specified when the metastore was set up. To create an external table, which registers the table in Unity Catalog without moving the data in cloud storage, see Upgrade an external table to Unity Catalog.

  4. Grant account-level users or groups access to the new table. See Manage access to data.

  5. After the table is migrated, users should update their existing queries and workloads to use the new table.

  6. Before you drop the old table, test for dependencies by revoking access to it and re-running related queries and workloads.
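
For example, the migration query in step 3, the grant in step 4, and the dependency test in step 6 might look like the following sketch. The column list, filter predicate, and group name are placeholders for illustration, not values the procedure requires:

    -- Step 3 variant: migrate only some columns and rows by narrowing
    -- the SELECT (the column list and predicate here are illustrative).
    CREATE TABLE <catalog>.<new_schema>.<new_table>
    AS SELECT id, region, amount
    FROM hive_metastore.<old_schema>.<old_table>
    WHERE region = 'US';

    -- Step 4: grant account-level principals access to the new table.
    GRANT SELECT ON TABLE <catalog>.<new_schema>.<new_table> TO `<group>`;

    -- Step 6: before dropping the old table, revoke access to it, then
    -- re-run dependent queries and workloads to find anything still using it.
    REVOKE ALL PRIVILEGES ON TABLE hive_metastore.<old_schema>.<old_table> FROM `<group>`;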

Upgrade an external table to Unity Catalog

You can copy an external table from your default Hive metastore to the Unity Catalog metastore using the Data Explorer.

Requirements

Before you begin, you must have:

  • A storage credential with an IAM role that authorizes Unity Catalog to access the tables’ location path.

  • An external location that references the storage credential you just created and the path to the data on your cloud tenant.

  • CREATE TABLE permission on the external location of the table to be upgraded (see the sketch after this list).
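
If your workspace supports creating these objects with SQL, the setup might look like the following sketch; the location name, credential name, storage path, and group are placeholders, and the privilege keyword mirrors the requirement above:

    -- Illustrative only: register an external location over the table's
    -- path using an existing storage credential, then grant CREATE TABLE.
    CREATE EXTERNAL LOCATION <location_name>
    URL 's3://<bucket>/<path>'
    WITH (STORAGE CREDENTIAL <credential_name>);

    GRANT CREATE TABLE ON EXTERNAL LOCATION <location_name> TO `<group>`;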

Upgrade process

To upgrade an external table:

  1. If you are not already in Databricks SQL, use the persona switcher in the sidebar to select SQL.

  2. Click Data in the sidebar to open the Data Explorer.

  3. Select the database, then the table, that you want to upgrade.

  4. Click Upgrade in the top-right corner of the table detail view.

  5. Select your destination catalog and database in Unity Catalog, then click Upgrade.

  6. The table metadata is copied to Unity Catalog, and a new table is created. You can now define fine-grained access control in the Permissions tab.

  7. Modify workloads to use the new table.

    Note

    If you no longer need the old table, you can drop it from the Hive Metastore. Dropping an external table does not modify the data files on your cloud tenant.
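
If you do decide to drop the old table, the statement is an ordinary DROP TABLE against the Hive metastore, for example:

    DROP TABLE hive_metastore.<old_schema>.<old_table>;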

Upgrade a view to Unity Catalog

After you upgrade all of a view’s referenced tables to the same Unity Catalog metastore, you can create a new view that references the new tables.

Note

If your view also references other views, upgrade those views first.
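
For example, a view over two upgraded tables might be recreated and shared as follows; the view name, column list, join, and group are placeholders for illustration:

    -- Recreate the view on top of the upgraded Unity Catalog tables.
    CREATE VIEW <catalog>.<new_schema>.<new_view> AS
    SELECT t.id, t.amount, d.region
    FROM <catalog>.<new_schema>.<new_table> AS t
    JOIN <catalog>.<new_schema>.<dim_table> AS d
      ON t.id = d.id;

    -- Grant account-level principals access to the new view.
    GRANT SELECT ON VIEW <catalog>.<new_schema>.<new_view> TO `<group>`;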

After you upgrade the view, grant account-level users and groups access to it.

Before you drop the old view, test for dependencies by revoking access to it and re-running related queries and workloads.

Upgrade a schema or multiple tables to Unity Catalog

You can copy complete schemas (databases) and multiple tables from your default Hive metastore to the Unity Catalog metastore using the Data Explorer upgrade wizard.

Requirements

Before you begin, you must have:

  • A storage credential with an IAM role that authorizes Unity Catalog to access the tables’ location path.

  • An external location that references the storage credential you just created and the path to the data on your cloud tenant.

  • CREATE TABLE permission on the external locations of the tables to be upgraded.

Upgrade process

  1. Click Data in the sidebar to open the Data Explorer.

  2. Select hive_metastore as your catalog and select the schema (database) that you want to upgrade.

  3. Click Upgrade at the top right of the schema detail view.

  4. Select all of the tables that you want to upgrade and click Next.

    Only external tables in formats supported by Unity Catalog can be upgraded using the upgrade wizard.

  5. Set the destination catalog and schema for each table.

    You can access each newly created table in the context of its catalog and schema. The destination catalog and schema can be set for each table individually or in bulk: select some or all of the tables, then set the destination catalog and schema.

  6. Review the table configurations. To modify them, click the Previous button.

  7. Click Create Query for Upgrade.

    A query editor appears with the generated SQL statements (an illustrative example follows these steps).

  8. Run the query.

    When the query finishes, each table’s metadata has been copied from the Hive metastore to Unity Catalog, and the tables are marked as upgraded in the upgrade wizard.

  9. Define fine-grained access control using the Permissions tab of each new table.

  10. Modify your workloads to use the new tables.
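
The wizard writes the SQL for you, so you do not author these statements yourself. As a rough illustration, a generated statement that registers an external Delta table at its existing location, without moving any data files, might look like this (catalog, schema, table, and path are placeholders):

    -- Illustrative only: register the existing external data as a
    -- Unity Catalog table at its current storage location.
    CREATE TABLE <catalog>.<new_schema>.<new_table>
    USING DELTA
    LOCATION 's3://<bucket>/<path>';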

Note

If you no longer need the old tables, you can drop them from the Hive Metastore. Dropping an external table does not modify the data files on your cloud tenant.