Configure customer-managed keys for encryption

Account admins can use the Databricks account console to configure customer-managed keys for encryption. You can also configure customer-managed keys using the Account Key Configurations API.

There are two Databricks use cases for adding a customer-managed key:

  • Managed services data in the Databricks control plane.

  • Workspace storage (your workspace’s root S3 buckets and the EBS volumes of compute resources in the classic compute plane).

Note

Customer-managed keys for EBS volumes do not apply to serverless compute resources. Disks for serverless compute resources are short-lived and tied to the lifecycle of the serverless workload. When compute resources are stopped or scaled down, the VMs and their storage are destroyed.

To compare the customer-managed key use cases, see Compare customer-managed keys use cases.

For a list of regions that support customer-managed keys, see Databricks clouds and regions. This feature requires the Enterprise pricing tier.

What is an encryption keys configuration?

Customer-managed keys are managed with encryption keys configurations. Encryption keys configurations are account-level objects that reference your cloud’s key.

Account admins create encryption keys configurations in the account console. An encryption keys configuration can be attached to one or more workspaces.

You can share a Databricks key configuration object between the two different encryption use cases (managed services and workspace storage).

You can add an encryption keys configuration to your Databricks workspace during workspace creation or you can update an existing workspace with an encryption key configuration.

Step 1: Create or select a key in AWS KMS

You can use the same AWS KMS key for both the workspace storage and managed services use cases.

  1. Create or select a symmetric key in AWS KMS, following the instructions in Creating symmetric CMKs or Viewing keys.

The KMS key must be in the same AWS region as your workspace.

  2. Copy these values, which you need in a later step:

    • Key ARN: Get the ARN from the console or the API (the Arn field in the JSON response).

    • Key alias: An alias specifies a display name for the CMK in AWS KMS.

  3. On the Key policy tab, switch to the policy view. Edit the key policy to add the statements below so that Databricks can use the key to perform encryption and decryption operations.

    Copy the policy statements below that match your encryption use case.

    Add the JSON to your key policy in the "Statement" section. Do not delete the existing key policies.

    The policy uses the Databricks AWS account ID 414351767826. If you are using Databricks on AWS GovCloud, use the Databricks account ID 044793339203.

    Both managed services and workspace storage:

    To allow Databricks to encrypt cluster EBS volumes, replace <cross-account-iam-role-arn> in the policy with the ARN of the cross-account IAM role that you created to allow Databricks to access your account. This is the same role ARN that you use to register a Databricks credential configuration for a Databricks workspace.

    {
      "Sid": "Allow Databricks to use KMS key for DBFS",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::414351767826:root"
      },
      "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalTag/DatabricksAccountId": ["<databricks-account-id>(s)"]
        }
      }
    },
    {
      "Sid": "Allow Databricks to use KMS key for DBFS (Grants)",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::414351767826:root"
      },
      "Action": [
        "kms:CreateGrant",
        "kms:ListGrants",
        "kms:RevokeGrant"
      ],
      "Resource": "*",
      "Condition": {
        "Bool": {
          "kms:GrantIsForAWSResource": "true"
        },
        "StringEquals": {
          "aws:PrincipalTag/DatabricksAccountId": ["<databricks-account-id>(s)"]
        }
      }
    },
    {
      "Sid": "Allow Databricks to use KMS key for managed services in the control plane",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::414351767826:root"
      },
      "Action": [
        "kms:Encrypt",
        "kms:Decrypt"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalTag/DatabricksAccountId": ["<databricks-account-id>(s)"]
        }
      }
    },
    {
      "Sid": "Allow Databricks to use KMS key for EBS",
      "Effect": "Allow",
      "Principal": {
        "AWS": "<cross-account-iam-role-arn>"
      },
      "Action": [
        "kms:Decrypt",
        "kms:GenerateDataKey*",
        "kms:CreateGrant",
        "kms:DescribeKey"
      ],
      "Resource": "*",
      "Condition": {
        "ForAnyValue:StringLike": {
          "kms:ViaService": "ec2.*.amazonaws.com"
        }
      }
    }

    Managed services only:
    
    {
      "Sid": "Allow Databricks to use KMS key for managed services in the control plane",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::414351767826:root"
      },
      "Action": [
        "kms:Encrypt",
        "kms:Decrypt"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalTag/DatabricksAccountId": ["<databricks-account-id>(s)"]
        }
      }
    }

    Workspace storage only:
    

    To allow Databricks to encrypt cluster EBS volumes, replace <cross-account-iam-role-arn> in the policy with the ARN of the cross-account IAM role that you created to allow Databricks to access your account. This is the same role ARN that you use to register a Databricks credential configuration for a Databricks workspace.

    {
      "Sid": "Allow Databricks to use KMS key for DBFS",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::414351767826:root"
      },
      "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalTag/DatabricksAccountId": ["<databricks-account-id>(s)"]
        }
      }
    },
    {
      "Sid": "Allow Databricks to use KMS key for DBFS (Grants)",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::414351767826:root"
      },
      "Action": [
        "kms:CreateGrant",
        "kms:ListGrants",
        "kms:RevokeGrant"
      ],
      "Resource": "*",
      "Condition": {
        "Bool": {
          "kms:GrantIsForAWSResource": "true"
        },
        "StringEquals": {
          "aws:PrincipalTag/DatabricksAccountId": ["<databricks-account-id>(s)"]
        }
      }
    },
    {
      "Sid": "Allow Databricks to use KMS key for EBS",
      "Effect": "Allow",
      "Principal": {
        "AWS": "<cross-account-iam-role-arn>"
      },
      "Action": [
        "kms:Decrypt",
        "kms:GenerateDataKey*",
        "kms:CreateGrant",
        "kms:DescribeKey"
      ],
      "Resource": "*",
      "Condition": {
        "ForAnyValue:StringLike": {
          "kms:ViaService": "ec2.*.amazonaws.com"
        }
      }
    }
    

    Note

    To retrieve your Databricks account ID, follow Locate your account ID.
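
The instruction to add the Databricks statements without deleting existing key policies can be sketched in Python. This is a minimal illustration, not an official tool: the helper names `managed_services_statement` and `add_statements`, the sample existing policy, and the example AWS account ID 111122223333 are all assumptions made for the example. It only builds the merged policy document in memory; applying it to the key is still done in the AWS console or with your own tooling.

```python
import copy
import json

# Databricks AWS account ID quoted in this article (non-GovCloud).
DATABRICKS_AWS_ACCOUNT_ID = "414351767826"


def managed_services_statement(databricks_account_id):
    """Build the managed services statement from this step (hypothetical helper)."""
    return {
        "Sid": "Allow Databricks to use KMS key for managed services in the control plane",
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{DATABRICKS_AWS_ACCOUNT_ID}:root"},
        "Action": ["kms:Encrypt", "kms:Decrypt"],
        "Resource": "*",
        "Condition": {
            "StringEquals": {
                "aws:PrincipalTag/DatabricksAccountId": [databricks_account_id]
            }
        },
    }


def add_statements(key_policy, new_statements):
    """Return a copy of key_policy with new_statements appended.

    Existing statements are always kept. A new statement whose Sid is
    already present is skipped, so repeating the merge adds nothing.
    """
    merged = copy.deepcopy(key_policy)
    statements = merged.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may hold a single statement object instead of a list.
        statements = [statements]
    existing_sids = {s["Sid"] for s in statements if "Sid" in s}
    for statement in new_statements:
        sid = statement.get("Sid")
        if sid is not None and sid in existing_sids:
            continue
        statements.append(copy.deepcopy(statement))
        if sid is not None:
            existing_sids.add(sid)
    merged["Statement"] = statements
    return merged


# Example: a made-up existing key policy with one statement.
existing_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Enable IAM User Permissions",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": "kms:*",
            "Resource": "*",
        }
    ],
}

updated_policy = add_statements(
    existing_policy, [managed_services_statement("<databricks-account-id>")]
)
print(json.dumps(updated_policy, indent=2))
```

The merge works on a deep copy, so the original policy object is left untouched and can be kept as a backup before the edited policy is saved.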

Step 2: Add an access policy to your cross-account IAM role (Optional)

If your KMS key is in a different AWS account than the cross-account IAM role used to deploy your workspace, then you must add a policy to that cross-account IAM role. This policy enables Databricks to access your key. If your KMS key is in the same AWS account as the cross-account IAM role used to deploy your workspace, then you do not need to do this step.

  1. Log into the AWS Management Console as a user with administrator privileges and go to the IAM console.

  2. In the left navigation pane, click Roles.

  3. In the list of roles, click the cross-account IAM role that you created for Databricks.

  4. Add an inline policy.

    1. On the Permissions tab, click Add inline policy.

    2. In the policy editor, click the JSON tab.

    3. Copy the access policy below into the editor and replace the placeholders with your values:

      {
        "Sid": "AllowUseOfCMKInAccount<AccountIdOfCrossAccountIAMRole>",
        "Effect": "Allow",
        "Action": [
          "kms:Decrypt",
          "kms:GenerateDataKey*",
          "kms:CreateGrant",
          "kms:DescribeKey"
        ],
        "Resource": "arn:aws:kms:<region>:<AccountIdOfKMSKey>:key/<KMSKeyId>",
        "Condition": {
          "ForAnyValue:StringLike": {
            "kms:ViaService": "ec2.*.amazonaws.com"
          }
        }
      }
      
    4. Click Review policy.

    5. In the Name field, enter a policy name.

    6. Click Create policy.
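
Whether this optional step applies depends only on the AWS account IDs embedded in the two ARNs. The following Python sketch shows one way to check that, assuming the standard ARN layout `arn:partition:service:region:account-id:resource`. The function names and the example ARNs are hypothetical and exist only for illustration.

```python
def account_id_from_arn(arn):
    """Return the account ID field (fifth colon-separated field) of an AWS ARN."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    return parts[4]


def needs_role_policy(kms_key_arn, cross_account_role_arn):
    """True when the KMS key and the cross-account IAM role are in different AWS accounts."""
    return account_id_from_arn(kms_key_arn) != account_id_from_arn(cross_account_role_arn)


# Made-up example ARNs.
key_arn = "arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
same_account_role = "arn:aws:iam::111122223333:role/databricks-cross-account"
other_account_role = "arn:aws:iam::444455556666:role/databricks-cross-account"

print(needs_role_policy(key_arn, same_account_role))   # False: skip this step
print(needs_role_policy(key_arn, other_account_role))  # True: add the inline policy
```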

Step 3: Create a new key configuration

Create a Databricks encryption key configuration object using the Databricks account console. You can use an encryption key configuration across multiple workspaces.

  1. As an account admin, log in to the account console.

  2. In the sidebar, click Cloud resources.

  3. Click the Encryption keys configuration tab.

  4. Click Add encryption key.

  5. Select the use cases for this encryption key:

    • Both managed services and workspace storage

    • Managed services

    • Workspace storage

  6. In the AWS key ARN field, enter the key ARN that you copied above.

  7. In the AWS key alias field, enter the key alias that you copied above.

  8. Click Add.

  9. Copy the Name.
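
If you prefer the Account Key Configurations API to the account console, the same inputs (use case, key ARN, key alias) go into a request body. The sketch below only builds and validates that body; it does not call any endpoint. The field names `use_cases` and `aws_key_info`, and the values `MANAGED_SERVICES` and `STORAGE`, are assumptions based on my reading of the API and should be confirmed against the API reference before use. The helper names and the example ARN are hypothetical. The region check reflects the Step 1 requirement that the key be in the same AWS region as the workspace.

```python
import json

# Assumed mapping from the console choices to API use case values.
USE_CASES = {
    "both": ["MANAGED_SERVICES", "STORAGE"],
    "managed_services": ["MANAGED_SERVICES"],
    "workspace_storage": ["STORAGE"],
}


def key_region(key_arn):
    """Return the region field of a KMS key ARN."""
    parts = key_arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "kms":
        raise ValueError(f"not a KMS key ARN: {key_arn!r}")
    return parts[3]


def build_key_configuration(key_arn, key_alias, use_case, workspace_region):
    """Build an (assumed) request body for creating a key configuration."""
    if use_case not in USE_CASES:
        raise ValueError(f"unknown use case: {use_case!r}")
    if key_region(key_arn) != workspace_region:
        raise ValueError(
            f"key is in {key_region(key_arn)} but the workspace is in {workspace_region}"
        )
    return {
        "use_cases": USE_CASES[use_case],
        "aws_key_info": {"key_arn": key_arn, "key_alias": key_alias},
    }


# Made-up example values.
body = build_key_configuration(
    key_arn="arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab",
    key_alias="alias/databricks-cmk",
    use_case="both",
    workspace_region="us-west-2",
)
print(json.dumps(body, indent=2))
```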

Step 4: Add the key configuration to a workspace

Add the encryption key configuration that you created to a workspace. You cannot add the encryption key to a workspace using the account console. This section uses the Databricks CLI to add an encryption key to a workspace. You can also use the Account API.

To create a new workspace using the encryption key configuration, follow the instructions in Create a workspace using the Account API.

  1. Terminate all running compute in your workspace.

  2. Update a workspace with your key configuration.

    To add the key for managed services, set managed_services_customer_managed_key_id to the key name that you copied above.

    To add the key for workspace storage, set storage_customer_managed_key_id to the key name that you copied above.

    Replace <workspace-id> with your Databricks workspace ID.

    For example:

    databricks account workspaces update <workspace-id> --json '{
      "managed_services_customer_managed_key_id": "<databricks-key-name>",
      "storage_customer_managed_key_id": "<databricks-key-name>"
    }'
    
  3. If you are adding keys for workspace storage, wait at least 20 minutes before starting any compute or using the DBFS API.

  4. Restart compute that you terminated in a previous step.
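
Hand-written JSON on the command line is easy to get wrong, for example with a trailing comma. As a rough sketch, the `--json` argument for the update command above can be generated with Python's `json` module so that it is always valid. The function name `build_update_command` and the sample workspace ID are hypothetical; the command and field names are the ones used in this step. The script only prints the command and does not run it.

```python
import json
import shlex


def build_update_command(workspace_id, managed_services_key=None, storage_key=None):
    """Return the CLI argument list for updating a workspace's key configuration."""
    payload = {}
    if managed_services_key:
        payload["managed_services_customer_managed_key_id"] = managed_services_key
    if storage_key:
        payload["storage_customer_managed_key_id"] = storage_key
    if not payload:
        raise ValueError("provide at least one key configuration")
    return [
        "databricks", "account", "workspaces", "update", str(workspace_id),
        "--json", json.dumps(payload),
    ]


# Made-up workspace ID; the key name placeholder is the one copied in Step 3.
argv = build_update_command(
    1234567890,
    managed_services_key="<databricks-key-name>",
    storage_key="<databricks-key-name>",
)
# shlex.join quotes the JSON so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
```

Passing `argv` to `subprocess.run` would execute the command directly, which avoids shell quoting altogether.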

Rotate an existing key

You can rotate (update) an existing key only for the managed services use case. You cannot rotate an existing key for the workspace storage use case. However, AWS provides automatic CMK master key rotation, which rotates the underlying key without changing the key ARN. Automatic CMK master key rotation is compatible with Databricks customer-managed keys for storage. For more information, see Rotating AWS KMS keys.

To rotate an existing key for managed services, follow the instructions in Step 4: Add the key configuration to a workspace. You must keep your old KMS key available to Databricks for 24 hours.
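
The 24-hour retention requirement is simple date arithmetic. The sketch below, with a hypothetical helper name and a made-up rotation timestamp, computes the earliest time at which the old key could be disabled or scheduled for deletion.

```python
from datetime import datetime, timedelta, timezone

# Retention period for the old key stated in this section.
OLD_KEY_RETENTION = timedelta(hours=24)


def earliest_old_key_retirement(rotated_at):
    """Return the earliest time the old KMS key may be made unavailable."""
    return rotated_at + OLD_KEY_RETENTION


# Made-up rotation time in UTC.
rotated_at = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)
print(earliest_old_key_retirement(rotated_at).isoformat())
# 2024-01-16T09:30:00+00:00
```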