Customer-managed keys for managed services

Preview

This feature is in Public Preview.

To use customer-managed keys for managed services, the workspace must be on the E2 version of the Databricks platform or on a custom plan that has been enabled by Databricks for this feature. All new Databricks accounts and most existing accounts are now E2. If you are unsure which account type you have, contact your Databricks representative.

Overview

Security-conscious organizations have risk management processes that evaluate risks of public cloud use, SaaS applications, and third-party services. Reducing risk from third-party service providers helps build a strong case for using external services. Some regulated industries may require encryption of some types of data with keys that they manage. These considerations are especially important for sectors that regularly use personal data or other confidential information.

Workspace notebooks are primarily stored in the Databricks control plane, where they are accessed by the main Databricks services within the Databricks AWS account. The Databricks platform allows you to configure encryption of these notebooks with your own key. You must provide this key at the time you create your workspace.

Your managed services key is also used to encrypt:

  • Secrets
  • Databricks SQL queries and query history

Note

This feature does not encrypt data stored outside of the control plane. To encrypt data in your root S3 bucket and cluster EBS volumes, see the related feature Customer-managed keys for workspace storage.

Important

Workspace data plane VPCs can be in AWS regions ap-northeast-1, ap-south-1, ap-southeast-2, ca-central-1, eu-west-1, eu-west-2, eu-central-1, us-east-1, us-east-2, us-west-1, and us-west-2. However, you cannot use a VPC in us-west-1 if you want to use customer-managed keys to encrypt managed services or workspace storage.
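The region restriction above can be expressed as a small pre-flight check. This is a minimal sketch, not part of any Databricks tooling: the function name and constants are hypothetical, and the region lists are copied from the paragraph above, so they may go stale as regions are added.

```python
# Regions listed above as valid for workspace data plane VPCs.
SUPPORTED_VPC_REGIONS = frozenset({
    "ap-northeast-1", "ap-south-1", "ap-southeast-2", "ca-central-1",
    "eu-west-1", "eu-west-2", "eu-central-1",
    "us-east-1", "us-east-2", "us-west-1", "us-west-2",
})

# Regions where customer-managed keys cannot be used (managed services or storage).
CMK_UNSUPPORTED_REGIONS = frozenset({"us-west-1"})


def cmk_allowed_in_region(region: str) -> bool:
    """Return True if a VPC in `region` can be combined with customer-managed keys.

    Raises ValueError if the region is not in the supported VPC list at all.
    """
    if region not in SUPPORTED_VPC_REGIONS:
        raise ValueError(f"{region} is not a supported data plane VPC region")
    return region not in CMK_UNSUPPORTED_REGIONS
```

Running a check like this before workspace creation surfaces the us-west-1 restriction early, rather than at the point where the key configuration is registered.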

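You can optionally share this key and its Databricks key configuration object (which references your key) between two different encryption use cases:

  • Customer-managed keys for managed services (this feature)
  • Customer-managed keys for workspace storage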

There are important differences in when you can add the keys for these two use cases:

  • For customer-managed keys for managed services, you must add the key and the Databricks key configuration that references it to your Databricks workspace during workspace creation.
  • For customer-managed keys for storage, you can add the key and the key configuration to your Databricks workspace during workspace creation, but you can also add them to a running workspace.

Important

In both cases, once a key has been added, you cannot change (rotate) the key to a different value.

How encryption works for managed services in the control plane

A customer-managed key encrypts the workspace’s managed services data in the control plane, including notebooks, secrets, Databricks SQL queries, and Databricks SQL query history. Customers provide a secret revocable key called a customer-managed key (CMK), which is specified by its ID in the cloud service’s key management system. In AWS, customer keys are managed by AWS Key Management Service (KMS).

Additionally, Databricks creates a Databricks-managed key (DMK) for each workspace. The DMK and the CMK are used to jointly wrap or unwrap the data encryption key (DEK). Databricks uses the DEK to encrypt the workspace’s managed services data.

The DEK is cached in memory for several read and write operations and is evicted at a regular interval, so that later requests require a new call to your cloud service’s key management system. If you delete or revoke your key, reads and writes to notebooks and other managed services data fail at the end of the cache time interval.
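The caching behavior described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the Databricks implementation: the class and function names are hypothetical, the 60-second interval is an arbitrary value chosen for the demonstration, and `fake_unwrap` stands in for the real call to the key management system.

```python
import time
from typing import Callable, List, Optional


class KeyRevokedError(Exception):
    """Raised by the stand-in KMS when the customer-managed key is unavailable."""


class DekCache:
    """Toy time-based cache for an unwrapped data encryption key (DEK)."""

    def __init__(self, unwrap: Callable[[], bytes], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._unwrap = unwrap        # hypothetical call to the key management system
        self._ttl = ttl_seconds      # hypothetical cache interval
        self._clock = clock
        self._dek: Optional[bytes] = None
        self._expires_at = 0.0

    def get(self) -> bytes:
        now = self._clock()
        if self._dek is None or now >= self._expires_at:
            # Cache miss or expiry: evict the old DEK and ask the KMS again.
            self._dek = None
            self._dek = self._unwrap()
            self._expires_at = now + self._ttl
        return self._dek


def simulate_revocation() -> List[object]:
    """Show that a revoked key keeps working only until the cache interval ends."""
    now = [0.0]
    revoked = [False]
    kms_calls = [0]

    def fake_unwrap() -> bytes:
        kms_calls[0] += 1
        if revoked[0]:
            raise KeyRevokedError("customer-managed key is revoked")
        return b"example-dek"

    cache = DekCache(fake_unwrap, ttl_seconds=60, clock=lambda: now[0])
    events: List[object] = []

    cache.get()
    cache.get()
    events.append(kms_calls[0])                     # two reads, one KMS call

    revoked[0] = True
    now[0] = 30.0
    events.append(cache.get() == b"example-dek")    # still served from the cache

    now[0] = 61.0                                   # cache interval has passed
    try:
        cache.get()
        events.append("ok")
    except KeyRevokedError:
        events.append("failed")
    return events
```

In this model, `simulate_revocation()` returns `[1, True, "failed"]`: repeated reads share one key management call, a revoked key is still honored while the cached DEK is valid, and the first request after eviction fails.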

[Figure: how customer-managed keys work for managed services]

Add a customer-managed key for managed services to a new workspace

To add a customer-managed key for managed services, you must add the key when you create a workspace using the Account API.

Create a key

To configure your customer-managed key:

  1. Create or select a symmetric key in AWS KMS, following the instructions in Creating symmetric CMKs or Viewing keys.

  2. Copy these values. You will use them when you create the workspace:

    • Key ARN — Get the ARN from the console or the API (the Arn field in the JSON response).
    • Key alias — An alias specifies a display name for the customer-managed key in AWS KMS. Use an alias to identify a customer-managed key in cryptographic operations. For more information, see the AWS documentation: AWS::KMS::Alias and Working with aliases.
  3. On the Key policy tab, switch to the policy view. Edit the key policy so that Databricks can use the key to perform encryption and decryption operations. Add the following to the key policy "Statement":

    {
      "Sid": "Allow Databricks to use KMS key for managed services in the control plane",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::414351767826:root"
      },
      "Action": [
        "kms:Encrypt",
        "kms:Decrypt"
      ],
      "Resource": "*"
    }
    

    For more information, see the AWS article Editing keys.
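If you manage key policies with a script rather than the console, the edit in step 3 amounts to appending one statement to the policy document. The following is a minimal sketch using only the Python standard library. The helper name `add_databricks_statement` is hypothetical, and the statement itself is copied from step 3.

```python
import copy

# The statement from step 3, copied verbatim.
DATABRICKS_STATEMENT = {
    "Sid": "Allow Databricks to use KMS key for managed services in the control plane",
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::414351767826:root"},
    "Action": ["kms:Encrypt", "kms:Decrypt"],
    "Resource": "*",
}


def add_databricks_statement(policy: dict) -> dict:
    """Return a copy of `policy` with the Databricks statement appended.

    The input is left unmodified. The statement is added only once,
    identified by its Sid, so the function is safe to call repeatedly.
    """
    updated = copy.deepcopy(policy)
    statements = updated.setdefault("Statement", [])
    if isinstance(statements, dict):
        # A policy may carry a single statement object instead of a list.
        statements = [statements]
        updated["Statement"] = statements
    already_present = any(
        s.get("Sid") == DATABRICKS_STATEMENT["Sid"] for s in statements
    )
    if not already_present:
        statements.append(copy.deepcopy(DATABRICKS_STATEMENT))
    return updated
```

One way to use this, assuming you work with boto3, is to load the current policy with the KMS client's `get_key_policy`, parse the returned `Policy` string with `json.loads`, pass it through the helper, and write it back with `put_key_policy` after serializing with `json.dumps`. Review the resulting document before applying it, because a key policy replaces the previous one in full.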

Register a key for a new workspace

To register the key, follow the instructions in Create a new workspace using the Account API, specifically Step 5: Configure customer-managed keys (optional). Those instructions show how to optionally share this key for encrypting workspace storage.

Important

You can add a customer-managed key for managed services only during workspace creation. This differs from encrypting workspace storage, which supports adding a key to a running workspace.
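As an illustration of the registration step, the sketch below builds the kind of request body that a key configuration might use. The field names (`aws_key_info`, `key_arn`, `key_alias`, `use_cases`) and the use case values (`MANAGED_SERVICES`, `STORAGE`) are assumptions based on the Account API and are not stated on this page. Confirm them against Step 5 of the workspace creation instructions before relying on them. The helper name and the example ARN and alias are hypothetical.

```python
import json


def build_key_configuration(key_arn: str, key_alias: str,
                            share_with_storage: bool = False) -> dict:
    """Build a key configuration request body (field names are assumptions).

    key_arn and key_alias are the values copied in step 2 of "Create a key".
    """
    if not (key_arn.startswith("arn:") and ":kms:" in key_arn):
        raise ValueError("key_arn does not look like an AWS KMS key ARN")
    use_cases = ["MANAGED_SERVICES"]
    if share_with_storage:
        # Optionally share the same key for workspace storage encryption.
        use_cases.append("STORAGE")
    return {
        "aws_key_info": {"key_arn": key_arn, "key_alias": key_alias},
        "use_cases": use_cases,
    }


if __name__ == "__main__":
    body = build_key_configuration(
        "arn:aws:kms:us-west-2:111122223333:key/example-key-id",  # placeholder ARN
        "alias/example-alias",                                     # placeholder alias
    )
    print(json.dumps(body, indent=2))
```

Because the key for managed services cannot be added later, it helps to validate the body before sending the workspace creation request.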