Create a storage credential

Storage credentials are used to link permissions defined in your cloud account to managed privileges in Unity Catalog. Storage credentials grant Unity Catalog access to manage data stored in object storage within your cloud account. Databricks recommends that you use cloud storage permissions that have global privileges and restrict access as necessary using Unity Catalog.

To create a storage credential, you must be a Databricks account admin, a metastore admin, or a user with the CREATE STORAGE CREDENTIAL privilege. You need an IAM role that authorizes access (read, or read and write) to an S3 bucket path. You reference that IAM role when you create the storage credential.

The user who creates the storage credential can delegate ownership to another user or group to manage permissions on it.

Important

The name of the S3 bucket that you want users to read from and write to cannot use dot notation (for example, incorrect.bucket.name.notation). For more bucket naming guidance, see the AWS bucket naming rules.
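
As a quick pre-check before you create the IAM role, the dot restriction above can be expressed in a few lines of Python. This is a minimal sketch, not part of any Databricks or AWS tooling: the function name `has_dot_notation` and the second bucket name are hypothetical, and the check covers only the dot rule described here, not the full AWS bucket naming rules.

```python
def has_dot_notation(bucket_name: str) -> bool:
    """Return True if the bucket name contains a dot, which this guide disallows."""
    return "." in bucket_name


# Hypothetical bucket names, used only for illustration.
for name in ("incorrect.bucket.name.notation", "my-unity-catalog-bucket"):
    status = "not usable" if has_dot_notation(name) else "ok"
    print(f"{name}: {status}")
```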

Step 1: Create an IAM role

In AWS, create an IAM role that gives access to the S3 bucket that you want your users to access. This IAM role must be defined in the same account as the S3 bucket.

Tip

If you have already created an IAM role that provides this access, you can skip this step and go straight to Step 2: Give Databricks the IAM role details.

  1. Create an IAM role that will allow access to the S3 bucket.

    Role creation is a two-step process. In this step you create the role, adding a temporary trust relationship policy and a placeholder external ID that you then modify after creating the storage credential in Databricks.

    You must modify the trust policy after you create the role because your role must be self-assuming (that is, it must be configured to trust itself). The role must therefore exist before you add the self-assumption statement. For information about self-assuming roles, see this Amazon blog article.

    To create the policy, you must use a placeholder external ID. An external ID is required in AWS to grant access to your AWS resources to a third party.

    1. Create the IAM role with a Custom Trust Policy.

    2. In the Custom Trust Policy field, paste the following policy JSON.

      This policy establishes a cross-account trust relationship so that Unity Catalog can assume the role to access the data in the bucket on behalf of Databricks users. This is specified by the ARN in the Principal section. It is a static value that references a role created by Databricks. Do not modify it.

      The policy sets the external ID to 0000 as a placeholder. You update this to the external ID of your storage credential in a later step.

      {
        "Version": "2012-10-17",
        "Statement": [{
          "Effect": "Allow",
          "Principal": {
            "AWS": [
              "arn:aws:iam::414351767826:role/unity-catalog-prod-UCMasterRole-14S5ZJVKOTYTL"
            ]
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "sts:ExternalId": "0000"
            }
          }
        }]
      }
      
    3. Skip the permissions policy configuration. You’ll go back to add that in a later step.

    4. Save the IAM role.

  2. Create the following IAM policy in the same account as the S3 bucket, replacing the following values:

    • <BUCKET>: The name of the S3 bucket.

    • <KMS-KEY>: Optional. If encryption is enabled, provide the name of the KMS key that encrypts the S3 bucket contents. If encryption is disabled, remove the entire KMS section of the IAM policy.

    • <AWS-ACCOUNT-ID>: The Account ID of your AWS account (not your Databricks account).

    • <AWS-IAM-ROLE-NAME>: The name of the AWS IAM role that you created in the previous step.

    This IAM policy grants read and write access. You can also create a policy that grants read access only. However, this may be unnecessary, because you can mark the storage credential as read-only, and any write access granted by this IAM role will be ignored.

    {
      "Version": "2012-10-17",
      "Statement": [
          {
              "Action": [
                  "s3:GetObject",
                  "s3:PutObject",
                  "s3:DeleteObject",
                  "s3:ListBucket",
                  "s3:GetBucketLocation"
              ],
              "Resource": [
                  "arn:aws:s3:::<BUCKET>/*",
                  "arn:aws:s3:::<BUCKET>"
              ],
              "Effect": "Allow"
          },
          {
              "Action": [
                  "kms:Decrypt",
                  "kms:Encrypt",
                  "kms:GenerateDataKey*"
              ],
              "Resource": [
                  "arn:aws:kms:<KMS-KEY>"
              ],
              "Effect": "Allow"
          },
          {
              "Action": [
                  "sts:AssumeRole"
              ],
              "Resource": [
                  "arn:aws:iam::<AWS-ACCOUNT-ID>:role/<AWS-IAM-ROLE-NAME>"
              ],
              "Effect": "Allow"
          }
        ]
    }
    

    Note

    If you need a more restrictive IAM policy for Unity Catalog, contact your Databricks account team for assistance.

  3. Attach the IAM policy to the IAM role.

    On the role’s Permissions tab, attach the IAM policy that you just created.
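
The placeholder substitution described for the IAM policy in this step can also be scripted. The following is a minimal sketch that uses only the Python standard library and mirrors the policy JSON shown above; the function name `build_permissions_policy` and the sample argument values are hypothetical. It assumes that passing no KMS key means encryption is disabled, in which case the KMS statement is left out, as the instructions require.

```python
import json
from typing import Optional


def build_permissions_policy(
    bucket: str,
    aws_account_id: str,
    role_name: str,
    kms_key: Optional[str] = None,
) -> dict:
    """Build the read/write IAM policy from this step as a Python dict."""
    statements = [
        {
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:ListBucket",
                "s3:GetBucketLocation",
            ],
            "Resource": [f"arn:aws:s3:::{bucket}/*", f"arn:aws:s3:::{bucket}"],
            "Effect": "Allow",
        }
    ]
    # Include the KMS statement only when a key is supplied (encryption enabled).
    if kms_key is not None:
        statements.append(
            {
                "Action": ["kms:Decrypt", "kms:Encrypt", "kms:GenerateDataKey*"],
                "Resource": [f"arn:aws:kms:{kms_key}"],
                "Effect": "Allow",
            }
        )
    # Let the role assume itself; this pairs with the trust policy update in Step 3.
    statements.append(
        {
            "Action": ["sts:AssumeRole"],
            "Resource": [f"arn:aws:iam::{aws_account_id}:role/{role_name}"],
            "Effect": "Allow",
        }
    )
    return {"Version": "2012-10-17", "Statement": statements}


# Hypothetical values, for illustration only.
policy = build_permissions_policy("my-bucket", "123456789012", "my-uc-role")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the policy editor in the AWS console. Review it against the policy shown above before you attach it to the role.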

Step 2: Give Databricks the IAM role details

  1. In Databricks, log in to a workspace that is linked to the metastore.

    You must have the CREATE STORAGE CREDENTIAL privilege. The metastore admin and account admin roles both include this privilege.

  2. Click Catalog.

  3. Click +Add > Add a storage credential.

  4. Enter a name for the credential, the IAM Role ARN that authorizes Unity Catalog to access the storage location on your cloud tenant, and an optional comment.

    Tip

    If you have already defined an instance profile in Databricks, you can click Copy instance profile to copy over the IAM role ARN for that instance profile. The instance profile’s IAM role must have a cross-account trust relationship that enables Databricks to assume the role in order to access the bucket on behalf of Databricks users. For more information about the IAM role policy and trust relationship requirements, see Step 1: Create an IAM role.

  5. (Optional) If you want users to have read-only access to the external locations that use this storage credential, in Advanced options select Read only. For more information, see Mark a storage credential as read-only.

  6. Click Create.

  7. In the Storage credential created dialog, copy the External ID.

  8. Create an external location that references this storage credential.

You can also create a storage credential by using the Databricks Terraform provider and the databricks_storage_credential resource.

Step 3: Update the IAM role policy

In AWS, modify the role’s trust relationship policy to add your storage credential’s external ID and to make the role self-assuming.

  1. Return to your saved IAM role and go to the Trust Relationships tab.

  2. Edit the trust relationship policy as follows:

    Add the following ARN to the Principal section of the “Allow” statement. Replace <YOUR-AWS-ACCOUNT-ID> and <THIS-ROLE-NAME> with your AWS account ID and the name of this IAM role. Use your AWS account ID here, not your Databricks account ID.

    "arn:aws:iam::<YOUR-AWS-ACCOUNT-ID>:role/<THIS-ROLE-NAME>"
    

    In the "sts:AssumeRole" statement, replace the placeholder external ID (0000) with the external ID that you copied from the Storage credential created dialog in Step 2.

    "sts:ExternalId": "<STORAGE-CREDENTIAL-EXTERNAL-ID>"
    

    Your policy should now look like the following, with the replacement text updated to use your storage credential’s external ID, account ID, and IAM role values:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": [
              "arn:aws:iam::414351767826:role/unity-catalog-prod-UCMasterRole-14S5ZJVKOTYTL",
              "arn:aws:iam::<YOUR-AWS-ACCOUNT-ID>:role/<THIS-ROLE-NAME>"
            ]
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "sts:ExternalId": "<STORAGE-CREDENTIAL-EXTERNAL-ID>"
            }
          }
        }
      ]
    }
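
The two trust policy edits in this step (adding the self-assuming ARN and replacing the placeholder external ID) can be sketched as a small Python transformation. This is an illustrative sketch using only the standard library. The function names, the sample account ID, the role name, and the external ID are hypothetical; the Databricks role ARN is the static value from the policy above.

```python
import copy
import json

# Static Databricks role ARN from the trust policy in this guide. Do not modify it.
UC_MASTER_ROLE_ARN = (
    "arn:aws:iam::414351767826:role/unity-catalog-prod-UCMasterRole-14S5ZJVKOTYTL"
)


def initial_trust_policy() -> dict:
    """Return the temporary trust policy from Step 1 with the placeholder external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": [UC_MASTER_ROLE_ARN]},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": "0000"}},
            }
        ],
    }


def finalize_trust_policy(
    policy: dict, aws_account_id: str, role_name: str, external_id: str
) -> dict:
    """Return a copy of the policy that is self-assuming and uses the real external ID."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    statement = updated["Statement"][0]
    self_arn = f"arn:aws:iam::{aws_account_id}:role/{role_name}"
    principals = statement["Principal"]["AWS"]
    if self_arn not in principals:
        principals.append(self_arn)
    statement["Condition"]["StringEquals"]["sts:ExternalId"] = external_id
    return updated


# Hypothetical values, for illustration only.
final_policy = finalize_trust_policy(
    initial_trust_policy(), "123456789012", "my-uc-role", "example-external-id"
)
print(json.dumps(final_policy, indent=2))
```

The output should match the policy shown above once your own values are substituted. Paste it on the role’s Trust Relationships tab only after you have checked it against that policy.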
    

Next steps

You can view, update, delete, and grant other users permission to use storage credentials. See Manage storage credentials.

You can define external locations using storage credentials. See Create an external location.