
Customer-managed keys for Unity Catalog

Beta

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Databricks previews.

Note

This feature requires the Enterprise tier.

Customer-managed keys (CMK) for Unity Catalog let you protect data managed by Databricks with your own encryption keys. You can configure encryption at the catalog level, using a separate key for each catalog based on data sensitivity or compliance requirements.

For information about CMK for managed services and workspace storage, see Customer-managed keys for managed services.


What is CMK for Unity Catalog?

CMK for Unity Catalog adds multi-key protection to data in Unity Catalog catalogs backed by default storage, using your own encryption keys from Google Cloud Key Management Service (KMS).

By default, Databricks encrypts all data at rest using Databricks-managed keys. For more granular control, CMK lets you assign a separate customer-managed key to specific catalogs. To deny access to the data, revoke the key in Google Cloud KMS.

Benefits of CMK for Unity Catalog

  • Granular encryption control: Manage encryption at the catalog level, allowing different catalogs to use different encryption keys based on data sensitivity or compliance requirements.
  • Multi-key protection: CMK secures your data against access at the storage layer. Data can only be accessed on authorized workspaces based on fine-grained Unity Catalog policies.
  • Compliance and audit: Meet regulatory requirements for customer-controlled encryption keys and maintain audit trails of key access and usage.
  • Key revocation: Revoke Databricks access to the key in Google Cloud KMS to make the protected data unreadable, so you retain full control over your data.
  • Centralized key management: Manage all encryption keys through Google Cloud KMS, consistent with your existing GCP security practices.

How CMK for Unity Catalog works

CMK for Unity Catalog on GCP uses Google Cloud KMS keys, Databricks-managed service accounts, and catalog-level encryption settings to enforce customer-controlled encryption. The following components are central to CMK for Unity Catalog on GCP:

  • Google Cloud KMS keys: You create and manage encryption keys in Google Cloud Key Management Service. These keys serve as the root encryption keys for Unity Catalog catalogs.
  • CMK configurations: You create CMK configurations in the Databricks account console to register your Google Cloud KMS keys with Databricks. CMK configurations are account-level objects that must be created before you can apply CMK to a catalog. When you create a CMK configuration, Databricks automatically provisions a service account in the Databricks Google Cloud project and grants it access to your KMS key.
  • Service accounts: Databricks creates a service account in the Databricks Google Cloud account that is authorized to access your Cloud KMS key. This service account has the format db-cmk-{id}@databricks-<REGION_SUFFIX>.iam.gserviceaccount.com. Databricks automatically grants the service account the IAM permissions required for encryption and decryption.
    • Databricks generates a unique, least-privilege service account for each CMK configuration to maintain security isolation and prevent the service account from inheriting unnecessary permissions.
  • Catalog-level encryption: You configure encryption directly on individual catalogs using Catalog Explorer or the Unity Catalog API. When you create or update a catalog with CMK settings, Databricks encrypts all data written to that catalog using your customer-managed key. This applies only to catalogs backed by default storage.
  • Dynamic enforcement: When data is written to a CMK-protected catalog, Databricks uses your KMS key to encrypt the data. When data is read, Databricks requests decryption from Google Cloud KMS. If you revoke Databricks access to the key, decryption fails and data becomes inaccessible.

Limitations

  • You can only configure this feature using the Databricks account console or REST API. Terraform support isn't available.
  • This feature only applies to catalogs backed by default storage. It doesn't apply to catalogs with external storage locations.

Prerequisites

Before you configure CMK for Unity Catalog on GCP, verify that you have the following:

  • Account administrator permissions: You must be a Databricks account administrator to create CMK configurations in the account console.

  • GCP IAM permissions: You need the following Google Cloud IAM permissions:

    • cloudkms.cryptoKeys.setIamPolicy - Required to grant the Databricks service account access to your KMS key
    • cloudkms.cryptoKeys.getIamPolicy - Required to verify service account permissions on your KMS key
  • Google Cloud KMS key: You must have an existing Cloud KMS key in your Google Cloud account. Follow the Google Cloud KMS quickstart guide to create a key if needed. This key must be in an active state in a region that supports your Databricks workspace. Copy the resource ID of your KMS key, which has the format: projects/{project}/locations/{location}/keyRings/{keyRing}/cryptoKeys/{cryptoKey}.

  • Unity Catalog permissions: To create or update catalogs with CMK, you must have CREATE CATALOG and USE CATALOG privileges in Unity Catalog.
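Because the key resource ID is pasted into the account console later, it can help to check its shape first. The following is a minimal sketch, assuming a POSIX shell with grep; the function name is hypothetical, and the check is purely structural (it doesn't confirm that the key exists or is active).

```shell
#!/bin/sh
# Succeed if the argument looks like a Cloud KMS key resource ID of the form
# projects/{project}/locations/{location}/keyRings/{keyRing}/cryptoKeys/{cryptoKey}.
# Structural check only: a well-formed ID can still point at a missing key.
is_kms_key_resource_id() {
  printf '%s\n' "$1" | grep -Eq \
    '^projects/[^/]+/locations/[^/]+/keyRings/[^/]+/cryptoKeys/[^/]+$'
}

# Example with a made-up key resource ID.
key_id="projects/my-project/locations/us-central1/keyRings/my-ring/cryptoKeys/my-key"
if is_kms_key_resource_id "$key_id"; then
  echo "key resource ID format looks valid"
else
  echo "unexpected key resource ID format" >&2
fi
```

The pattern rejects key version paths (those ending in /cryptoKeyVersions/{version}), because the format above refers to the key itself rather than one of its versions.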

Configure CMK for Unity Catalog

Follow these steps to configure customer-managed keys for Unity Catalog catalogs on GCP.

Step 1: Create a CMK configuration in the account console

Permissions required: Account administrator

Creating a CMK configuration registers your Google Cloud KMS key with Databricks and provisions a service account that Databricks uses to access the key.

  1. In the Databricks account console, go to Security > Encryption keys.

  2. Click Add Encryption Key.

  3. Configure the encryption key settings:

    • Name: Enter a descriptive name for your CMK configuration, such as finance-catalog-cmk or pii-data-cmk.
    • Cloud provider: Select Google Cloud Platform.
    • Use case: Choose either Managed services or Both managed services and workspace storage.
    • Key resource ID: Enter your Cloud KMS key resource ID in the format projects/{project}/locations/{location}/keyRings/{keyRing}/cryptoKeys/{cryptoKey}.

  4. Click Add to create the CMK configuration.

    When you click Add, Databricks creates a new service account in the Databricks Google Cloud account and automatically grants it Cloud KMS CryptoKey Encrypter/Decrypter permission on your KMS key.

  5. Copy the CMK configuration ID from the account console. You use this ID when you create or update catalogs.

Step 2: Verify service account authorization

After you create the CMK configuration, verify that the Databricks service account has the correct permissions on your Cloud KMS key.

  1. Retrieve the service account ID by calling the GetCustomerManagedKey API:

    Bash
    curl -X GET \
      -H "Authorization: Bearer <DATABRICKS_TOKEN>" \
      "https://accounts.cloud.databricks.com/api/2.0/accounts/<ACCOUNT_ID>/customer-managed-keys/<CUSTOMER_MANAGED_KEY_ID>"

    Replace the following values:

    • <DATABRICKS_TOKEN>: Your Databricks account administrator personal access token
    • <ACCOUNT_ID>: Your Databricks account ID
    • <CUSTOMER_MANAGED_KEY_ID>: The CMK configuration ID from Step 1

    The response includes the service account in the format db-cmk-{id}@databricks-<REGION_SUFFIX>.iam.gserviceaccount.com. The service account string is dynamically generated and must be copied exactly from the API response for verification.

  2. In the Google Cloud Console, go to your Cloud KMS key.

  3. Click the Permissions tab.

  4. Verify that the service account returned by the API call in step 1 is listed with the Cloud KMS CryptoKey Encrypter/Decrypter role.
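The same check can be scripted with the gcloud CLI instead of the console. This is a sketch under stated assumptions: gcloud is installed and authenticated, the function names are hypothetical, and roles/cloudkms.cryptoKeyEncrypterDecrypter is the role ID behind the Cloud KMS CryptoKey Encrypter/Decrypter display name.

```shell
#!/bin/sh
# Print one component of a Cloud KMS key resource ID.
# $1 is the resource ID; $2 is the label in front of the wanted value
# (projects, locations, keyRings, or cryptoKeys).
kms_key_part() {
  printf '%s\n' "$1" | sed -n "s|.*$2/\([^/]*\).*|\1|p"
}

# Read IAM members on stdin, one per line, and succeed only if the given
# service account appears as an exact match.
has_member() {
  grep -Fqx "serviceAccount:$1"
}

# List the members bound to the Encrypter/Decrypter role on the key and look
# for the Databricks service account. $1 is the key resource ID and $2 is the
# service account email copied from the API response.
check_cmk_binding() {
  key_id="$1"
  service_account="$2"
  gcloud kms keys get-iam-policy "$(kms_key_part "$key_id" cryptoKeys)" \
    --project="$(kms_key_part "$key_id" projects)" \
    --location="$(kms_key_part "$key_id" locations)" \
    --keyring="$(kms_key_part "$key_id" keyRings)" \
    --flatten="bindings[].members" \
    --filter="bindings.role=roles/cloudkms.cryptoKeyEncrypterDecrypter" \
    --format="value(bindings.members)" | has_member "$service_account"
}
```

Calling check_cmk_binding with your key resource ID and the service account email returns exit status 0 when the binding is present and a non-zero status otherwise, so it can gate later steps in a script.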

Step 3: Create a new catalog with CMK

Permissions required: CREATE CATALOG in Unity Catalog

To create a new catalog with CMK protection, use the Unity Catalog API:

Bash
curl -X POST \
  -H "Authorization: Bearer <api_token>" \
  -H "Content-Type: application/json" \
  "https://<workspace_url>/api/2.1/unity-catalog/catalogs" \
  -d '{
    "name": "<catalog_name>",
    "comment": "Catalog with customer-managed encryption",
    "storage_mode": "DEFAULT_STORAGE",
    "encryption_settings": {
      "customer_managed_key_id": "<cmk-id>"
    }
  }'

Replace the following values:

  • <workspace_url>: Your Databricks workspace hostname, without the https:// prefix, which the command already includes (for example, dbc-1234567-a8b9.cloud.databricks.com)
  • <api_token>: Your Databricks personal access token
  • <catalog_name>: The name for your new catalog (for example, finance_data or customer_pii)
  • <cmk-id>: The CMK configuration ID from Step 1
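If you create several catalogs this way, the request above can be wrapped in a small script. The following is a minimal sketch: the function names and the WORKSPACE_HOST and DATABRICKS_TOKEN environment variables are hypothetical, and the payload builder inserts values verbatim, so it assumes catalog names and IDs without quotes or backslashes.

```shell
#!/bin/sh
# Print the JSON request body for creating a CMK-protected catalog.
# $1 is the catalog name and $2 is the CMK configuration ID. Values are
# inserted as-is, so they must not contain quotes or backslashes.
catalog_payload() {
  printf '{"name": "%s", "storage_mode": "DEFAULT_STORAGE", "encryption_settings": {"customer_managed_key_id": "%s"}}\n' "$1" "$2"
}

# Send the create request. WORKSPACE_HOST (hostname only, no scheme) and
# DATABRICKS_TOKEN are assumed to be set in the environment. --fail makes
# curl exit with a non-zero status on HTTP errors, and --data @- reads the
# request body from stdin.
create_cmk_catalog() {
  catalog_payload "$1" "$2" | curl --fail --silent --show-error -X POST \
    -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
    -H "Content-Type: application/json" \
    --data @- \
    "https://${WORKSPACE_HOST}/api/2.1/unity-catalog/catalogs"
}
```

Keeping the payload builder separate from the network call makes it possible to inspect the request body before anything is sent.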

Step 4: Update an existing catalog with CMK

Permissions required: MANAGE on the catalog or ownership of the catalog

To add or change CMK protection on an existing catalog that uses default storage:

  1. In Catalog Explorer, click the catalog name.
  2. Click the Details tab.
  3. Under Advanced, click Encryption settings.
  4. In the dialog, select your customer-managed key.
  5. Click Save.

You can change the key associated with a catalog at any time by repeating these steps. You can't disable CMK after it's enabled on a catalog.

Important

When you add CMK to an existing catalog, Databricks encrypts only new data written to the catalog with your customer-managed key. Databricks-managed keys continue to encrypt existing data. To encrypt all data with your customer-managed key, you must rewrite the existing data.

Verify CMK configuration

To verify that your catalog is configured with CMK, use the Unity Catalog API to get the catalog details:

Bash
curl -X GET \
  -H "Authorization: Bearer <api_token>" \
  -H "Content-Type: application/json" \
  "https://<workspace_url>/api/2.1/unity-catalog/catalogs/<catalog_name>"

The response includes the encryption_settings field for catalogs configured with CMK:

JSON
{
  "name": "<catalog_name>",
  "storage_mode": "DEFAULT_STORAGE",
  "encryption_settings": {
    "customer_managed_key_id": "<cmk-id>"
  }
}
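To compare the returned key ID with the one you expect, you can pull the field out of the response in a script. This sketch assumes the response has the shape shown above; the function name is hypothetical, and it uses a plain-text match with sed rather than a real JSON parser, so treat it as a quick check only.

```shell
#!/bin/sh
# Read a catalog JSON response on stdin and print the value of
# customer_managed_key_id, or nothing if the field is absent.
# Newlines are removed first so the match works on pretty-printed JSON.
extract_cmk_id() {
  tr -d '\n' |
    sed -n 's/.*"customer_managed_key_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example with a made-up response body.
response='{"name": "finance_data", "encryption_settings": {"customer_managed_key_id": "abc-123"}}'
actual="$(printf '%s\n' "$response" | extract_cmk_id)"
if [ "$actual" = "abc-123" ]; then
  echo "catalog uses the expected CMK configuration"
else
  echo "catalog has no CMK or a different one: '$actual'" >&2
fi
```

In practice, you would pipe the output of the curl command above into extract_cmk_id instead of using a fixed string.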

Revoke access to encrypted data

To deny Databricks access to data encrypted with your customer-managed key, disable your key in Google Cloud KMS:

  1. In the Google Cloud Console, go to your Cloud KMS key.
  2. Disable the key version.

After you disable the key, Databricks can no longer decrypt data in catalogs using this CMK configuration. Any attempts to read data from these catalogs fail with a decryption error.

There might be a delay between the time you disable the key and when data access is denied.

To restore access, re-enable the key version in Google Cloud KMS.
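The console steps above have a gcloud CLI equivalent. The sketch below only prints the command so that you can review it before running it. The function names are hypothetical, and the version number is an assumption that you must replace with the key version your key actually uses.

```shell
#!/bin/sh
# Print one component of a Cloud KMS key resource ID.
# $1 is the resource ID; $2 is the label in front of the wanted value.
kms_key_part() {
  printf '%s\n' "$1" | sed -n "s|.*$2/\([^/]*\).*|\1|p"
}

# Print the gcloud command that disables or re-enables a key version.
# $1 is "disable" or "enable", $2 is the key resource ID, and $3 is the
# key version number. Nothing is executed here.
kms_version_command() {
  printf 'gcloud kms keys versions %s %s --key=%s --keyring=%s --location=%s --project=%s\n' \
    "$1" "$3" \
    "$(kms_key_part "$2" cryptoKeys)" \
    "$(kms_key_part "$2" keyRings)" \
    "$(kms_key_part "$2" locations)" \
    "$(kms_key_part "$2" projects)"
}

# Example with a made-up key resource ID and version 1.
kms_version_command disable \
  "projects/my-project/locations/us-central1/keyRings/my-ring/cryptoKeys/my-key" 1
```

Passing enable instead of disable prints the matching command for restoring access.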