Step 1: Create an instance profile
This article explains how to create an instance profile using the AWS console.
You can deploy compute resources with instance profiles for secure access to data stored in S3. Instance profiles are AWS IAM roles associated with EC2 instances.
Administrators configure IAM roles in AWS, link them to a Databricks workspace, and grant privileged users permission to associate instance profiles with compute. All users with access to a compute resource that has an instance profile attached gain the privileges granted by that instance profile.
Note
Databricks recommends using Unity Catalog external locations to connect to S3 instead of instance profiles. Unity Catalog simplifies security and governance of your data by providing a central place to administer and audit data access across multiple workspaces in your account. See What is Unity Catalog? and Connect to S3 with Unity Catalog.
Use the AWS console to create an instance profile
In the AWS console, go to the IAM service.
Click the Roles tab in the sidebar.
Click Create role.
Under Trusted entity type, select AWS service.
Under Use case, select EC2.
Click Next.
At the bottom of the page, click Next.
In the Role name field, type a role name.
Click Create role.
In the role list, click the role.
Add an inline policy to the role. This policy grants access to the S3 bucket.
In the Permissions tab, click Add permissions > Create inline policy.
Click the JSON tab.
Copy this policy, replacing <s3-bucket-name> with the name of your bucket:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::<s3-bucket-name>"]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:PutObjectAcl"
      ],
      "Resource": ["arn:aws:s3:::<s3-bucket-name>/*"]
    }
  ]
}
```
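If you prefer to script this step rather than paste JSON into the console, the same inline policy can be generated programmatically. Below is a minimal Python sketch; the `build_s3_access_policy` helper and the example bucket name are illustrative, not part of the official workflow:

```python
import json


def build_s3_access_policy(bucket: str) -> dict:
    """Build the inline policy that grants an instance profile access to one S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing objects requires permission on the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Object-level operations require permission on the bucket contents.
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:DeleteObject",
                    "s3:PutObjectAcl",
                ],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # Print the policy document; you could save this to a file and attach it
    # with `aws iam put-role-policy --policy-document file://policy.json`.
    print(json.dumps(build_s3_access_policy("my-example-bucket"), indent=2))
```

Note the two separate statements: `s3:ListBucket` applies to the bucket ARN itself, while the object actions apply to `<bucket>/*`. Mixing these up is a common cause of access errors.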
Click Review policy.
In the Name field, type a policy name.
Click Create policy.
In the role summary, copy the Role ARN.
Note
If you intend to enable encryption for the S3 bucket, you must add the IAM role as a Key User for the KMS key provided in the configuration. See Configure encryption for S3 with KMS.
Enable the policy to work with serverless resources
This step ensures that your instance profile works if you later configure Databricks SQL to use it. With this trust policy in place, the instance profile can be used with serverless, pro, and classic SQL warehouses.
In the role list, click your instance profile.
Select the Trust Relationships tab.
Click Edit Trust Policy.
Within the existing Statement array, append the following JSON block to the end of the existing trust policy. Ensure that you don't overwrite the existing policy.

```json
{
  "Effect": "Allow",
  "Principal": {
    "AWS": [
      "arn:aws:iam::790110701330:role/serverless-customer-resource-role"
    ]
  },
  "Action": "sts:AssumeRole",
  "Condition": {
    "StringEquals": {
      "sts:ExternalId": [
        "databricks-serverless-<YOUR_WORKSPACE_ID1>",
        "databricks-serverless-<YOUR_WORKSPACE_ID2>"
      ]
    }
  }
}
```
The only values you need to change in the statement are the workspace IDs. Replace the <YOUR_WORKSPACE_ID> placeholders with one or more Databricks workspace IDs for the workspaces that will use this role.

Note

To get your workspace ID, check the URL when you're using your workspace. For example, in https://<databricks-instance>/?o=6280049833385130, the number after o= is the workspace ID.

Do not edit the principal of the policy. The Principal.AWS field must keep the value arn:aws:iam::790110701330:role/serverless-customer-resource-role. This references a serverless compute role managed by Databricks.

Click Review policy.
Click Save changes.
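The append-without-overwrite requirement above can also be sketched in Python. Assuming you already have the existing trust policy as a dict (for example, decoded from the output of `aws iam get-role`), this hypothetical helper adds the serverless statement for your workspace IDs while leaving existing statements untouched:

```python
import json

# The Databricks-managed serverless compute role; do not change this value.
SERVERLESS_PRINCIPAL = "arn:aws:iam::790110701330:role/serverless-customer-resource-role"


def append_serverless_statement(trust_policy: dict, workspace_ids: list) -> dict:
    """Return a copy of the trust policy with the Databricks serverless
    trust statement appended, never overwriting existing statements."""
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": [SERVERLESS_PRINCIPAL]},
        "Action": "sts:AssumeRole",
        "Condition": {
            "StringEquals": {
                "sts:ExternalId": [
                    f"databricks-serverless-{wid}" for wid in workspace_ids
                ]
            }
        },
    }
    # Copy first, then append -- the existing Statement array is preserved.
    updated = dict(trust_policy)
    updated["Statement"] = list(trust_policy.get("Statement", [])) + [statement]
    return updated


if __name__ == "__main__":
    # A typical EC2 trust policy as created in the earlier step.
    existing = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }
    print(json.dumps(append_serverless_statement(existing, ["6280049833385130"]), indent=2))
```

You would then apply the result with the console editor as described above (or with `aws iam update-assume-role-policy`). The helper deliberately builds the ExternalId values from workspace IDs so the only user-supplied input is the ID list.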
Next steps
After you create an instance profile, you need to create the S3 bucket policy. See Step 2: Create a bucket policy.