Disable access to the Hive metastore used by your Databricks workspace

This article describes how to disable direct access to the legacy Hive metastore used by your Databricks workspace, whether that is the workspace-local Hive metastore or an external Hive metastore such as AWS Glue. When you've completed your Unity Catalog migration, or federated your Hive metastore as a foreign catalog governed by Unity Catalog, you can use a simple workspace admin setting to prevent users from bypassing Unity Catalog and accessing tables registered in the Hive metastore.

Data in the Hive metastore is not governed by Unity Catalog. Disabling direct Hive metastore access is an important step in migrating to Unity Catalog and ensuring that you take full advantage of Unity Catalog data governance. You can disable direct access and still query tables managed by your Hive metastore by taking advantage of Hive metastore federation. You can federate Hive metastore tables either before or after you disable direct workspace access to the Hive metastore. See Migrating an existing workspace to Unity Catalog and Hive metastore federation: enable Unity Catalog to govern tables registered in a Hive metastore.

Databricks recommends that you disable direct access to the Hive metastore for all clusters and workloads at once, but you can also use a Spark configuration to disable access on a cluster-by-cluster basis.

Before you begin: when should you disable the legacy metastore?

Before you disable the legacy Hive metastore, you should meet the following criteria:

  • You’re done migrating all tables registered in the legacy metastore to Unity Catalog, or you have always used Unity Catalog and never the legacy Hive metastore.
  • You want to force your users to stop using tables registered in the legacy metastore.
  • You have upgraded all jobs to Databricks Runtime 13.3 LTS or above.
  • An account admin has turned on Unity Catalog: Disable Legacy Features on the account console Previews page.

What happens when you disable the legacy metastore?

After you disable the legacy metastore:

  • Any jobs running against tables registered to the Hive metastore will fail.

  • Fallback is disabled.

  • Jobs that run on Databricks Runtime versions below 13.3 will fail.

    Currently running jobs will continue to work until they are terminated, but restarts on those clusters will fail.

  • The Legacy heading and hive_metastore catalog disappear from the Catalog Explorer browser pane.

  • SQL commands that attempt to show the contents of the hive_metastore catalog will fail, as illustrated in the sketch after this list.
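
For example, in a notebook attached to a cluster where legacy access is disabled, queries that resolve through the hive_metastore catalog raise errors while Unity Catalog queries keep working. This is a minimal sketch: the catalog, schema, and table names are hypothetical, and spark is the SparkSession that Databricks notebooks provide.

# Queries against Unity Catalog objects continue to work
# (catalog/schema/table names are placeholders):
spark.sql("SELECT * FROM main.default.my_table LIMIT 10").show()

# Queries against the legacy hive_metastore catalog fail once
# legacy access is disabled (hypothetical table name for illustration):
spark.sql("SHOW TABLES IN hive_metastore.default").show()
spark.sql("SELECT * FROM hive_metastore.default.my_table LIMIT 10").show()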

note

Disabling legacy access does not prevent users from using cluster-level credentials, such as instance profiles or service principals, that are available on a cluster. Databricks recommends that you remove such credentials from your clusters.

No Isolation shared clusters do not respect the legacy Hive metastore disablement setting. To prevent users from creating and using such clusters, enable the Enforce User Isolation setting for the workspace. See Enforce user isolation cluster types on a workspace.

Disable all direct access to the Hive metastore

Disable your workspace's legacy Hive metastore using the Disable legacy access workspace admin setting:

  1. As a workspace admin, log in to your Databricks workspace.

  2. Click the user profile menu at the top right and select Settings.

  3. Go to Workspace admin > Security.

  4. Set Disable legacy access to Disabled: legacy access features cannot be used.

    note

    If this setting is missing, ask an account admin to turn on the Previews > Unity Catalog: Disable Legacy Features setting in the account console.

  5. To ensure that the new setting has taken effect, wait approximately five minutes.

  6. Restart all running clusters. A scripted approach is sketched after these steps.
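
If you manage compute programmatically, you can script the cluster restarts instead of doing them one by one in the UI. The following is a minimal sketch, assuming the Databricks SDK for Python (databricks-sdk) is installed and authenticated for your workspace; it is illustrative, not part of the admin setting itself.

# Restart every running cluster so the new workspace setting takes effect.
# Assumes the Databricks SDK for Python and valid workspace credentials.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import State

w = WorkspaceClient()
for cluster in w.clusters.list():
    if cluster.state == State.RUNNING:
        w.clusters.restart(cluster_id=cluster.cluster_id)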

Disable access for individual compute clusters

You can disable direct access gradually, on a cluster-by-cluster basis. Skip the steps in the previous section and set the following Spark configuration on any non-serverless cluster:

spark.databricks.unityCatalogOnlyMode True

This approach can be useful during a Unity Catalog migration when you want to reduce reliance on Hive metastore incrementally until you can disable it for the entire workspace.
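
From a notebook attached to a cluster, you can check whether the configuration is in effect. A small sketch: spark is the SparkSession that Databricks notebooks provide, and the fallback value shown is an assumption for clusters where the configuration is not set.

# Check whether this cluster is running in Unity Catalog-only mode.
# Returns the configured value, or "false" if the configuration is not set.
spark.conf.get("spark.databricks.unityCatalogOnlyMode", "false")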

See Spark configuration.