Create a workspace with manual AWS configurations
This page explains how to deploy a classic workspace using manually created AWS resources. Use this method if you want to create your own AWS resources or need to deploy a workspace with custom configurations such as your own VPC, specific IAM policies, or pre-existing S3 buckets.
For most deployments, Databricks recommends using automated configuration, which uses AWS IAM temporary delegation to automatically provision all required resources.
Requirements
To create a workspace with manual configuration, you must:
- Be an account admin in your Databricks account.
- Have permissions to provision IAM roles, S3 buckets, and access policies in your AWS account.
- Have an available VPC and NAT gateway in your AWS account in the workspace's region. You can view your available quotas and request increases using the AWS Service Quotas console.
- Have the STS endpoint activated for us-west-2. For details, see the AWS documentation. (A quick way to check is sketched after this list.)
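If you want to confirm that the regional STS endpoint responds before you start, the following minimal sketch calls it directly. It assumes boto3 and locally configured AWS credentials; the endpoint URL is the standard regional STS address.

```python
# Informal check: call the us-west-2 STS endpoint directly.
# If the regional STS endpoint is not activated for your account, this call fails.
import boto3

sts = boto3.client(
    "sts",
    region_name="us-west-2",
    endpoint_url="https://sts.us-west-2.amazonaws.com",
)
print(sts.get_caller_identity()["Account"])
```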
Create a Databricks workspace with manual AWS configurations
To create a workspace with manually configured AWS resources:
- Go to the account console and click the Workspaces icon.
- Click Create Workspace.
- In the Workspace name field, enter a human-readable name for this workspace. It can contain spaces.
- In the Region field, select an AWS region for your workspace's network and compute resources.
- In the Cloud credentials dropdown, select or create a credential configuration. If you create a new credential configuration, see Create a credential configuration.
- In the Cloud storage dropdown, select or create the storage configuration you'll use for this workspace. If you create a new storage configuration, see Create a storage configuration.
- (Optional) Set up any Advanced configurations. See Advanced configurations.
- Click Next.
- Review your workspace details and click Create workspace.
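The same flow can also be scripted against the Databricks Account API once the credential and storage configurations exist. The sketch below is illustrative only: verify the endpoint and field names against the current Account API reference, and substitute a real account ID, OAuth token, credentials ID, and storage configuration ID.

```python
# Hedged sketch: create a workspace via the Account API instead of the console.
# All placeholder values must be replaced; confirm field names in the API reference.
import requests

ACCOUNT_ID = "<YOUR-DATABRICKS-ACCOUNT-ID>"
TOKEN = "<ACCOUNT-API-OAUTH-TOKEN>"

resp = requests.post(
    f"https://accounts.cloud.databricks.com/api/2.0/accounts/{ACCOUNT_ID}/workspaces",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "workspace_name": "my-workspace",
        "aws_region": "us-west-2",
        "credentials_id": "<CREDENTIALS-ID>",
        "storage_configuration_id": "<STORAGE-CONFIGURATION-ID>",
    },
)
resp.raise_for_status()
print(resp.json()["workspace_id"])
```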
Create a credential configuration
The credential configuration gives Databricks access to launch compute resources in your AWS account. This step requires you to create a new cross-account IAM role with an access policy.
Step 1: Create a cross-account IAM role
- Get your Databricks account ID. See Locate your account ID.
- Log into your AWS Console as a user with administrator privileges and go to the IAM console.
- Click the Roles tab in the sidebar.
- Click Create role.
- In Select type of trusted entity, click the AWS account tile.
- Select the Another AWS account checkbox.
- In the Account ID field, enter the Databricks account ID 414351767826. This is not the account ID you copied from the Databricks account console. If you are using Databricks on AWS GovCloud, use the Databricks account ID 044793339203 for AWS GovCloud or 170661010020 for AWS GovCloud DoD.
- Select the Require external ID checkbox.
- In the External ID field, enter your Databricks account ID, which you copied from the Databricks account console.
- Click the Next button.
- On the Add Permissions page, click the Next button. You should now be on the Name, review, and create page.
- In the Role name field, enter a role name.
- Click Create role. The list of roles appears.
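If you prefer to script Step 1, the boto3 sketch below creates an equivalent cross-account role. The role name is illustrative, and the trust policy mirrors what the console generates when you trust another AWS account and require an external ID.

```python
# Hedged sketch: create the cross-account IAM role from Step 1 with boto3.
import json
import boto3

iam = boto3.client("iam")

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # Databricks commercial account ID; see the GovCloud variants above.
            "Principal": {"AWS": "arn:aws:iam::414351767826:root"},
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {"sts:ExternalId": "<YOUR-DATABRICKS-ACCOUNT-ID>"}
            },
        }
    ],
}

iam.create_role(
    RoleName="databricks-cross-account-role",  # illustrative name
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)
```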
Step 2: Create an access policy
The access policy you add to the role depends on your Amazon VPC (Virtual Private Cloud) deployment type. For information about how Databricks uses each permission, see IAM permissions for Databricks-managed VPCs.
- In the Roles section of the IAM console, click the IAM role you created in Step 1.
- Click the Add permissions drop-down and select Create inline policy.
- In the policy editor, click the JSON tab.
- Copy and paste the appropriate access policy for your deployment type:
- Databricks-managed VPC
- Customer-managed VPC with default restrictions
- Customer-managed VPC with custom restrictions
A single VPC that Databricks creates and configures in your AWS account.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "Stmt1403287045000",
"Effect": "Allow",
"Action": [
"ec2:AllocateAddress",
"ec2:AssignPrivateIpAddresses",
"ec2:AssociateDhcpOptions",
"ec2:AssociateIamInstanceProfile",
"ec2:AssociateRouteTable",
"ec2:AttachInternetGateway",
"ec2:AttachVolume",
"ec2:AuthorizeSecurityGroupEgress",
"ec2:AuthorizeSecurityGroupIngress",
"ec2:CancelSpotInstanceRequests",
"ec2:CreateDhcpOptions",
"ec2:CreateFleet",
"ec2:CreateInternetGateway",
"ec2:CreateLaunchTemplate",
"ec2:CreateLaunchTemplateVersion",
"ec2:CreateNatGateway",
"ec2:CreateRoute",
"ec2:CreateRouteTable",
"ec2:CreateSecurityGroup",
"ec2:CreateSubnet",
"ec2:CreateTags",
"ec2:CreateVolume",
"ec2:CreateVpc",
"ec2:CreateVpcEndpoint",
"ec2:DeleteDhcpOptions",
"ec2:DeleteFleets",
"ec2:DeleteInternetGateway",
"ec2:DeleteLaunchTemplate",
"ec2:DeleteLaunchTemplateVersions",
"ec2:DeleteNatGateway",
"ec2:DeleteRoute",
"ec2:DeleteRouteTable",
"ec2:DeleteSecurityGroup",
"ec2:DeleteSubnet",
"ec2:DeleteTags",
"ec2:DeleteVolume",
"ec2:DeleteVpc",
"ec2:DeleteVpcEndpoints",
"ec2:DescribeAvailabilityZones",
"ec2:DescribeFleetHistory",
"ec2:DescribeFleetInstances",
"ec2:DescribeFleets",
"ec2:DescribeIamInstanceProfileAssociations",
"ec2:DescribeInstanceStatus",
"ec2:DescribeInstances",
"ec2:DescribeInternetGateways",
"ec2:DescribeLaunchTemplates",
"ec2:DescribeLaunchTemplateVersions",
"ec2:DescribeNatGateways",
"ec2:DescribePrefixLists",
"ec2:DescribeReservedInstancesOfferings",
"ec2:DescribeRouteTables",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSpotInstanceRequests",
"ec2:DescribeSpotPriceHistory",
"ec2:DescribeSubnets",
"ec2:DescribeVolumes",
"ec2:DescribeVpcs",
"ec2:DetachInternetGateway",
"ec2:DisassociateIamInstanceProfile",
"ec2:DisassociateRouteTable",
"ec2:GetLaunchTemplateData",
"ec2:GetSpotPlacementScores",
"ec2:ModifyFleet",
"ec2:ModifyLaunchTemplate",
"ec2:ModifyVpcAttribute",
"ec2:ReleaseAddress",
"ec2:ReplaceIamInstanceProfileAssociation",
"ec2:RequestSpotInstances",
"ec2:RevokeSecurityGroupEgress",
"ec2:RevokeSecurityGroupIngress",
"ec2:RunInstances",
"ec2:TerminateInstances"
],
"Resource": ["*"]
},
{
"Effect": "Allow",
"Action": ["iam:CreateServiceLinkedRole", "iam:PutRolePolicy"],
"Resource": "arn:aws:iam::*:role/aws-service-role/spot.amazonaws.com/AWSServiceRoleForEC2Spot",
"Condition": {
"StringLike": {
"iam:AWSServiceName": "spot.amazonaws.com"
}
}
}
]
}
Create your Databricks workspaces in your own VPC, using a feature known as customer-managed VPC.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "Stmt1403287045000",
"Effect": "Allow",
"Action": [
"ec2:AssociateIamInstanceProfile",
"ec2:AttachVolume",
"ec2:AuthorizeSecurityGroupEgress",
"ec2:AuthorizeSecurityGroupIngress",
"ec2:CancelSpotInstanceRequests",
"ec2:CreateTags",
"ec2:CreateVolume",
"ec2:DeleteTags",
"ec2:DeleteVolume",
"ec2:DescribeAvailabilityZones",
"ec2:DescribeIamInstanceProfileAssociations",
"ec2:DescribeInstanceStatus",
"ec2:DescribeInstances",
"ec2:DescribeInternetGateways",
"ec2:DescribeNatGateways",
"ec2:DescribeNetworkAcls",
"ec2:DescribePrefixLists",
"ec2:DescribeReservedInstancesOfferings",
"ec2:DescribeRouteTables",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSpotInstanceRequests",
"ec2:DescribeSpotPriceHistory",
"ec2:DescribeSubnets",
"ec2:DescribeVolumes",
"ec2:DescribeVpcAttribute",
"ec2:DescribeVpcs",
"ec2:DetachVolume",
"ec2:DisassociateIamInstanceProfile",
"ec2:ReplaceIamInstanceProfileAssociation",
"ec2:RequestSpotInstances",
"ec2:RevokeSecurityGroupEgress",
"ec2:RevokeSecurityGroupIngress",
"ec2:RunInstances",
"ec2:TerminateInstances",
"ec2:DescribeFleetHistory",
"ec2:ModifyFleet",
"ec2:DeleteFleets",
"ec2:DescribeFleetInstances",
"ec2:DescribeFleets",
"ec2:CreateFleet",
"ec2:DeleteLaunchTemplate",
"ec2:GetLaunchTemplateData",
"ec2:CreateLaunchTemplate",
"ec2:DescribeLaunchTemplates",
"ec2:DescribeLaunchTemplateVersions",
"ec2:ModifyLaunchTemplate",
"ec2:DeleteLaunchTemplateVersions",
"ec2:CreateLaunchTemplateVersion",
"ec2:AssignPrivateIpAddresses",
"ec2:GetSpotPlacementScores"
],
"Resource": ["*"]
},
{
"Effect": "Allow",
"Action": ["iam:CreateServiceLinkedRole", "iam:PutRolePolicy"],
"Resource": "arn:aws:iam::*:role/aws-service-role/spot.amazonaws.com/AWSServiceRoleForEC2Spot",
"Condition": {
"StringLike": {
"iam:AWSServiceName": "spot.amazonaws.com"
}
}
}
]
}
Create your Databricks workspaces in your own VPC with custom restrictions for account ID, VPC ID, AWS Region, and security group.
Replace the following values in the policy with your own configuration values:
- ACCOUNTID: Your AWS account ID, which is a number.
- VPCID: ID of the AWS VPC where you want to launch workspaces.
- REGION: AWS Region name for your VPC deployment, for example us-west-2.
- SECURITYGROUPID: ID of your AWS security group. When you add a security group restriction, you cannot reuse the cross-account IAM role or reference a credentials ID (credentials_id) for another workspace. You must create separate roles, policies, and credentials objects for each workspace.
If you have custom requirements configured for security groups with your customer-managed VPC, contact your Databricks account team for assistance with IAM policy customizations.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "NonResourceBasedPermissions",
"Effect": "Allow",
"Action": [
"ec2:AssignPrivateIpAddresses",
"ec2:CancelSpotInstanceRequests",
"ec2:DescribeAvailabilityZones",
"ec2:DescribeIamInstanceProfileAssociations",
"ec2:DescribeInstanceStatus",
"ec2:DescribeInstances",
"ec2:DescribeInternetGateways",
"ec2:DescribeNatGateways",
"ec2:DescribeNetworkAcls",
"ec2:DescribePrefixLists",
"ec2:DescribeReservedInstancesOfferings",
"ec2:DescribeRouteTables",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSpotInstanceRequests",
"ec2:DescribeSpotPriceHistory",
"ec2:DescribeSubnets",
"ec2:DescribeVolumes",
"ec2:DescribeVpcAttribute",
"ec2:DescribeVpcs",
"ec2:CreateTags",
"ec2:DeleteTags",
"ec2:GetSpotPlacementScores",
"ec2:RequestSpotInstances",
"ec2:DescribeFleetHistory",
"ec2:ModifyFleet",
"ec2:DeleteFleets",
"ec2:DescribeFleetInstances",
"ec2:DescribeFleets",
"ec2:CreateFleet",
"ec2:DeleteLaunchTemplate",
"ec2:GetLaunchTemplateData",
"ec2:CreateLaunchTemplate",
"ec2:DescribeLaunchTemplates",
"ec2:DescribeLaunchTemplateVersions",
"ec2:ModifyLaunchTemplate",
"ec2:DeleteLaunchTemplateVersions",
"ec2:CreateLaunchTemplateVersion"
],
"Resource": ["*"]
},
{
"Sid": "InstancePoolsSupport",
"Effect": "Allow",
"Action": [
"ec2:AssociateIamInstanceProfile",
"ec2:DisassociateIamInstanceProfile",
"ec2:ReplaceIamInstanceProfileAssociation"
],
"Resource": "arn:aws:ec2:REGION:ACCOUNTID:instance/*",
"Condition": {
"StringEquals": {
"ec2:ResourceTag/Vendor": "Databricks"
}
}
},
{
"Sid": "AllowEc2RunInstancePerTag",
"Effect": "Allow",
"Action": "ec2:RunInstances",
"Resource": ["arn:aws:ec2:REGION:ACCOUNTID:volume/*", "arn:aws:ec2:REGION:ACCOUNTID:instance/*"],
"Condition": {
"StringEquals": {
"aws:RequestTag/Vendor": "Databricks"
}
}
},
{
"Sid": "AllowEc2RunInstanceImagePerTag",
"Effect": "Allow",
"Action": "ec2:RunInstances",
"Resource": ["arn:aws:ec2:REGION:ACCOUNTID:image/*"],
"Condition": {
"StringEquals": {
"aws:ResourceTag/Vendor": "Databricks"
}
}
},
{
"Sid": "AllowEc2RunInstancePerVPCid",
"Effect": "Allow",
"Action": "ec2:RunInstances",
"Resource": [
"arn:aws:ec2:REGION:ACCOUNTID:network-interface/*",
"arn:aws:ec2:REGION:ACCOUNTID:subnet/*",
"arn:aws:ec2:REGION:ACCOUNTID:security-group/*"
],
"Condition": {
"StringEquals": {
"ec2:vpc": "arn:aws:ec2:REGION:ACCOUNTID:vpc/VPCID"
}
}
},
{
"Sid": "AllowEc2RunInstanceOtherResources",
"Effect": "Allow",
"Action": "ec2:RunInstances",
"NotResource": [
"arn:aws:ec2:REGION:ACCOUNTID:image/*",
"arn:aws:ec2:REGION:ACCOUNTID:network-interface/*",
"arn:aws:ec2:REGION:ACCOUNTID:subnet/*",
"arn:aws:ec2:REGION:ACCOUNTID:security-group/*",
"arn:aws:ec2:REGION:ACCOUNTID:volume/*",
"arn:aws:ec2:REGION:ACCOUNTID:instance/*"
]
},
{
"Sid": "EC2TerminateInstancesTag",
"Effect": "Allow",
"Action": ["ec2:TerminateInstances"],
"Resource": ["arn:aws:ec2:REGION:ACCOUNTID:instance/*"],
"Condition": {
"StringEquals": {
"ec2:ResourceTag/Vendor": "Databricks"
}
}
},
{
"Sid": "EC2AttachDetachVolumeTag",
"Effect": "Allow",
"Action": ["ec2:AttachVolume", "ec2:DetachVolume"],
"Resource": ["arn:aws:ec2:REGION:ACCOUNTID:instance/*", "arn:aws:ec2:REGION:ACCOUNTID:volume/*"],
"Condition": {
"StringEquals": {
"ec2:ResourceTag/Vendor": "Databricks"
}
}
},
{
"Sid": "EC2CreateVolumeByTag",
"Effect": "Allow",
"Action": ["ec2:CreateVolume"],
"Resource": ["arn:aws:ec2:REGION:ACCOUNTID:volume/*"],
"Condition": {
"StringEquals": {
"aws:RequestTag/Vendor": "Databricks"
}
}
},
{
"Sid": "EC2DeleteVolumeByTag",
"Effect": "Allow",
"Action": ["ec2:DeleteVolume"],
"Resource": ["arn:aws:ec2:REGION:ACCOUNTID:volume/*"],
"Condition": {
"StringEquals": {
"ec2:ResourceTag/Vendor": "Databricks"
}
}
},
{
"Effect": "Allow",
"Action": ["iam:CreateServiceLinkedRole", "iam:PutRolePolicy"],
"Resource": "arn:aws:iam::*:role/aws-service-role/spot.amazonaws.com/AWSServiceRoleForEC2Spot",
"Condition": {
"StringLike": {
"iam:AWSServiceName": "spot.amazonaws.com"
}
}
},
{
"Sid": "VpcNonresourceSpecificActions",
"Effect": "Allow",
"Action": [
"ec2:AuthorizeSecurityGroupEgress",
"ec2:AuthorizeSecurityGroupIngress",
"ec2:RevokeSecurityGroupEgress",
"ec2:RevokeSecurityGroupIngress"
],
"Resource": "arn:aws:ec2:REGION:ACCOUNTID:security-group/SECURITYGROUPID",
"Condition": {
"StringEquals": {
"ec2:vpc": "arn:aws:ec2:REGION:ACCOUNTID:vpc/VPCID"
}
}
}
]
}
Additional note: The Databricks production AWS account from which Amazon Machine Images (AMIs) are sourced is 601306020600. You can use this account ID to create custom access policies that restrict the AMIs that can be used in your AWS account. For more information, contact your Databricks account team.
- Click Review policy.
- In the Name field, enter a policy name.
- Click Create policy.
- (Optional) If you use Service Control Policies to deny certain actions at the AWS account level, ensure that sts:AssumeRole is allowlisted so Databricks can assume the cross-account role.
- In the role summary, copy the Role ARN to paste into the credential configuration step.
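To script Step 2, you can attach the chosen policy as an inline policy with boto3, as in the sketch below; the file name and policy name are illustrative, and the policy document is whichever deployment-type policy you copied above.

```python
# Hedged sketch: attach the selected access policy as an inline policy on the role.
import json
import boto3

iam = boto3.client("iam")

with open("databricks-access-policy.json") as f:  # the policy JSON copied above
    access_policy = json.load(f)

iam.put_role_policy(
    RoleName="databricks-cross-account-role",   # role from Step 1 (illustrative name)
    PolicyName="databricks-access-policy",      # illustrative policy name
    PolicyDocument=json.dumps(access_policy),
)
```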
Step 3: Create the credential configuration
The credential configuration is a Databricks configuration object that represents the IAM role that you created in the previous step.
To create a credential configuration:
- When creating the new workspace, in the Cloud credential dropdown menu, select Add cloud credential.
- Select Add manually.
- In the Cloud credential name field, enter a human-readable name for your new credential configuration.
- In the Role ARN field, paste the role ARN that you copied in the previous step.
- Click OK.
Databricks validates the credential configuration during this step. Possible errors can include an invalid ARN or incorrect permissions for the role, among others.
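You can also register the role programmatically through the Account API credentials endpoint, as sketched below. Verify the endpoint and field names against the current API reference; all placeholders are illustrative.

```python
# Hedged sketch: create the credential configuration via the Account API.
import requests

ACCOUNT_ID = "<YOUR-DATABRICKS-ACCOUNT-ID>"
TOKEN = "<ACCOUNT-API-OAUTH-TOKEN>"

resp = requests.post(
    f"https://accounts.cloud.databricks.com/api/2.0/accounts/{ACCOUNT_ID}/credentials",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "credentials_name": "my-credentials",
        "aws_credentials": {
            "sts_role": {"role_arn": "arn:aws:iam::<AWS-ACCOUNT-ID>:role/<ROLE-NAME>"}
        },
    },
)
resp.raise_for_status()
print(resp.json()["credentials_id"])
```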
Create a storage configuration
In the storage configuration step, you create a storage bucket to store your Databricks workspace assets such as data, libraries, and logs. This is also where the workspace's default catalog is stored. As part of the storage configuration, you also create an IAM role that Databricks uses to access the storage location.
Step 1: Create an S3 bucket
- Log into your AWS Console as a user with administrator privileges and go to the S3 service.
- Click the Create bucket button.
- Enter a name for the bucket.
- Select the AWS region that you will use for your Databricks workspace deployment.
- Click Create bucket.
- Click the Permissions tab.
- In the Bucket policy section, click Edit.
- Add the following bucket policy, replacing <BUCKET-NAME> with your bucket's name and <YOUR-DATABRICKS-ACCOUNT-ID> with your Databricks account ID.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "Grant Databricks Access",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::414351767826:root"
},
"Action": [
"s3:GetObject",
"s3:GetObjectVersion",
"s3:PutObject",
"s3:DeleteObject",
"s3:ListBucket",
"s3:GetBucketLocation"
],
"Resource": ["arn:aws:s3:::<BUCKET-NAME>/*", "arn:aws:s3:::<BUCKET-NAME>"],
"Condition": {
"StringEquals": {
"aws:PrincipalTag/DatabricksAccountId": ["<YOUR-DATABRICKS-ACCOUNT-ID>"]
}
}
},
{
"Sid": "Prevent DBFS from accessing Unity Catalog metastore",
"Effect": "Deny",
"Principal": {
"AWS": "arn:aws:iam::414351767826:root"
},
"Action": ["s3:*"],
"Resource": ["arn:aws:s3:::<BUCKET-NAME>/unity-catalog/*"]
}
]
}
- Save the bucket.
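To script this step, the boto3 sketch below creates the bucket and applies the policy shown above. The region, bucket name, and file name are illustrative.

```python
# Hedged sketch: create the workspace root bucket and apply the bucket policy.
import boto3

s3 = boto3.client("s3", region_name="us-west-2")
bucket = "<BUCKET-NAME>"

# Create the bucket in the workspace region (illustrative region).
s3.create_bucket(
    Bucket=bucket,
    CreateBucketConfiguration={"LocationConstraint": "us-west-2"},
)

# Apply the bucket policy shown above, saved locally as JSON.
with open("root-bucket-policy.json") as f:
    s3.put_bucket_policy(Bucket=bucket, Policy=f.read())
```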
Step 2: Create an IAM role with a custom trust policy
This IAM role and trust policy establish a cross-account trust relationship so that Databricks can access data in the S3 bucket on behalf of Databricks users. The ARN in the Principal section is a static value that references a role created by Databricks. The ARN is slightly different if you use Databricks on AWS GovCloud.
- In your AWS account, create an IAM role with a Custom Trust Policy.
- In the Custom Trust Policy field, paste the following policy JSON. The policy sets the external ID to 0000 as a placeholder. You update this to the account ID of your Databricks account in a later step.
  - Databricks on AWS
  - Databricks on AWS GovCloud
  - Databricks on AWS GovCloud DoD
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": ["arn:aws:iam::414351767826:role/unity-catalog-prod-UCMasterRole-14S5ZJVKOTYTL"]
},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "0000"
}
}
}
]
}
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": ["arn:aws-us-gov:iam::044793339203:role/unity-catalog-prod-UCMasterRole-1QRFA8SGY15OJ"]
},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "0000"
}
}
}
]
}
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": ["arn:aws-us-gov:iam::170661010020:role/unity-catalog-prod-UCMasterRole-1DI6DL6ZP26AS"]
},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "0000"
}
}
}
]
}
- Save the IAM role.
Now that you have created the role, you must update its trust policy to make it self-assuming.
- In the IAM role you just created, go to the Trust Relationships tab and edit the trust relationship policy as follows, replacing the <YOUR-AWS-ACCOUNT-ID>, <THIS-ROLE-NAME>, and <YOUR-DATABRICKS-ACCOUNT-ID> values. A scripted version of this update is sketched after these steps.
  - Databricks on AWS
  - Databricks on AWS GovCloud
  - Databricks on AWS GovCloud DoD
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::414351767826:role/unity-catalog-prod-UCMasterRole-14S5ZJVKOTYTL",
"arn:aws:iam::<YOUR-AWS-ACCOUNT-ID>:role/<THIS-ROLE-NAME>"
]
},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "<YOUR-DATABRICKS-ACCOUNT-ID>"
}
}
}
]
}
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws-us-gov:iam::044793339203:role/unity-catalog-prod-UCMasterRole-1QRFA8SGY15OJ",
"arn:aws:iam::<YOUR-AWS-ACCOUNT-ID>:role/<THIS-ROLE-NAME>"
]
},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "<YOUR-DATABRICKS-ACCOUNT-ID>"
}
}
}
]
}
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws-us-gov:iam::170661010020:role/unity-catalog-prod-UCMasterRole-1DI6DL6ZP26AS",
"arn:aws:iam::<YOUR-AWS-ACCOUNT-ID>:role/<THIS-ROLE-NAME>"
]
},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "<YOUR-DATABRICKS-ACCOUNT-ID>"
}
}
}
]
}
- Skip the permissions policy configuration. You'll go back to add that in a later step.
- Copy the IAM role ARN, which you'll paste into the storage configuration step.
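The trust-policy update referenced above can be scripted with boto3, as in the sketch below. The role name and file name are illustrative; the policy document is the edited trust relationship JSON for your deployment type.

```python
# Hedged sketch: make the role self-assuming by rewriting its trust policy.
import json
import boto3

iam = boto3.client("iam")

with open("unity-catalog-trust-policy.json") as f:  # edited trust policy from above
    trust_policy = json.load(f)

iam.update_assume_role_policy(
    RoleName="<THIS-ROLE-NAME>",
    PolicyDocument=json.dumps(trust_policy),
)
```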
Step 3: Create an IAM policy to grant read and write access
- Create an IAM policy in the same account as the S3 bucket, replacing the following values:
  - <BUCKET>: The name of the S3 bucket.
  - <AWS-ACCOUNT-ID>: The account ID of your AWS account (not your Databricks account).
  - <AWS-IAM-ROLE-NAME>: The name of the AWS IAM role that you created in the previous step.
  - <KMS-KEY> (Optional): If encryption is enabled, provide the name of the KMS key that encrypts the S3 bucket contents. If encryption is disabled, remove the entire KMS section of the IAM policy.
This IAM policy grants read and write access. You can also create a policy that grants read access only. However, this may be unnecessary, because you can mark the storage credential as read-only, and any write access granted by this IAM role will be ignored.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
"Resource": "arn:aws:s3:::<BUCKET>/unity-catalog/*"
},
{
"Effect": "Allow",
"Action": ["s3:ListBucket", "s3:GetBucketLocation"],
"Resource": "arn:aws:s3:::<BUCKET>"
},
{
"Action": ["kms:Decrypt", "kms:Encrypt", "kms:GenerateDataKey*"],
"Resource": ["arn:aws:kms:<KMS-KEY>"],
"Effect": "Allow"
},
{
"Action": ["sts:AssumeRole"],
"Resource": ["arn:aws:iam::<AWS-ACCOUNT-ID>:role/<AWS-IAM-ROLE-NAME>"],
"Effect": "Allow"
}
]
}
Note: If you need a more restrictive IAM policy for Unity Catalog, contact your Databricks account team for assistance.
- Create a separate IAM policy for file events in the same account as the S3 bucket.
  Note: This step is optional but highly recommended. If you do not grant Databricks access to configure file events on your behalf, you must configure file events manually for each location, and you will have limited access to critical features that Databricks releases in the future.
  The IAM policy grants Databricks permission to update your bucket's event notification configuration, create an SNS topic, create an SQS queue, and subscribe the SQS queue to the SNS topic. These resources are required for features that use file events. Replace <BUCKET> with the name of the S3 bucket.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "ManagedFileEventsSetupStatement",
"Effect": "Allow",
"Action": [
"s3:GetBucketNotification",
"s3:PutBucketNotification",
"sns:ListSubscriptionsByTopic",
"sns:GetTopicAttributes",
"sns:SetTopicAttributes",
"sns:CreateTopic",
"sns:TagResource",
"sns:Publish",
"sns:Subscribe",
"sqs:CreateQueue",
"sqs:DeleteMessage",
"sqs:ReceiveMessage",
"sqs:SendMessage",
"sqs:GetQueueUrl",
"sqs:GetQueueAttributes",
"sqs:SetQueueAttributes",
"sqs:TagQueue",
"sqs:ChangeMessageVisibility",
"sqs:PurgeQueue"
],
"Resource": ["arn:aws:s3:::<BUCKET>", "arn:aws:sqs:*:*:*", "arn:aws:sns:*:*:*"]
},
{
"Sid": "ManagedFileEventsListStatement",
"Effect": "Allow",
"Action": ["sqs:ListQueues", "sqs:ListQueueTags", "sns:ListTopics"],
"Resource": "*"
},
{
"Sid": "ManagedFileEventsTeardownStatement",
"Effect": "Allow",
"Action": ["sns:Unsubscribe", "sns:DeleteTopic", "sqs:DeleteQueue"],
"Resource": ["arn:aws:sqs:*:*:*", "arn:aws:sns:*:*:*"]
}
]
}
- Return to the IAM role that you created in Step 2.
- In the Permissions tab, attach the IAM policies that you just created (a scripted sketch follows these steps).
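As noted above, attaching the policies can also be scripted. The sketch below assumes you saved each policy document to a local JSON file; the role, file, and policy names are illustrative.

```python
# Hedged sketch: attach the storage-access and file-events policies as inline policies.
import boto3

iam = boto3.client("iam")

for policy_name, path in [
    ("databricks-storage-access", "storage-access-policy.json"),
    ("databricks-file-events", "file-events-policy.json"),
]:
    with open(path) as f:
        iam.put_role_policy(
            RoleName="<AWS-IAM-ROLE-NAME>",
            PolicyName=policy_name,
            PolicyDocument=f.read(),
        )
```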
Step 4: Create the storage configuration
Now, return to the workspace creation flow so you can manually create the storage configuration in Databricks:
- In the Cloud storage dropdown menu, select Add new cloud storage.
- Select Add manually.
- In the Storage configuration name field, enter a human-readable name for the storage configuration.
- In the Bucket name field, enter the name of the S3 bucket you created in your AWS account.
- In the IAM role ARN field, paste the ARN of the IAM role you created in step 2.
- Click OK.
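Alternatively, you can register the bucket as a storage configuration through the Account API, as sketched below. Verify the endpoint and field names against the current API reference; placeholders are illustrative, and the console flow above also captures the IAM role ARN, so check the reference for the corresponding field.

```python
# Hedged sketch: create the storage configuration via the Account API.
import requests

ACCOUNT_ID = "<YOUR-DATABRICKS-ACCOUNT-ID>"
TOKEN = "<ACCOUNT-API-OAUTH-TOKEN>"

resp = requests.post(
    f"https://accounts.cloud.databricks.com/api/2.0/accounts/{ACCOUNT_ID}/storage-configurations",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "storage_configuration_name": "my-storage",
        "root_bucket_info": {"bucket_name": "<BUCKET-NAME>"},
        # Note: the console also asks for the IAM role ARN from Step 2; consult the
        # current API reference for how to supply it here.
    },
)
resp.raise_for_status()
print(resp.json()["storage_configuration_id"])
```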
Advanced configurations
The following configurations are optional when you create a new workspace. To view these settings, click the Advanced configurations dropdown in the Credentials step.
- Metastore: Confirm the metastore assignment for your workspace. The metastore is automatically selected if a Unity Catalog metastore already exists in the workspace's region and the metastore is configured to be automatically assigned to new workspaces. If this is the first workspace you are deploying in a region, the metastore is created automatically. Metastores are created without metastore-level storage by default. If you want metastore-level storage, you can add it. See Add managed storage to an existing metastore.
- Network configuration: To create the workspace in your own VPC, select or add a Network configuration. For instructions on configuring your own VPC, see Configure a customer-managed VPC. If you are using a customer-managed VPC, ensure your IAM role uses an access policy that supports customer-managed VPCs.
- Private Link: To enable PrivateLink, select or add a private access setting. This requires that you create the correct regional VPC endpoints, register them, and reference them from your network configuration.
- Customer-managed keys: You can add encryption keys to your workspace deployment for managed services and workspace storage. The key for managed services encrypts notebooks, secrets, and Databricks SQL query data in the control plane. The key for workspace storage encrypts your workspace storage bucket and the EBS volumes of compute resources in the classic compute plane. For more guidance, see Configure customer-managed keys for encryption.
- Security and compliance: These checkboxes allow you to enable the compliance security profile, add compliance standards, and enable enhanced security monitoring for your workspace. For more information, see Configure enhanced security and compliance settings.
View workspace status
After you create a workspace, you can view its status on the Workspaces page.
- Provisioning: In progress. Wait a few minutes and refresh the page.
- Running: Successful workspace deployment.
- Failed: Failed deployment.
- Banned: Contact your Databricks account team.
- Cancelling: In the process of cancellation.
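If you script deployments, you can poll the status instead of refreshing the page. The sketch below assumes the Account API exposes a workspace_status field with values matching those above; verify the field name against the current API reference.

```python
# Hedged sketch: poll workspace status until provisioning finishes.
import time
import requests

ACCOUNT_ID = "<YOUR-DATABRICKS-ACCOUNT-ID>"
WORKSPACE_ID = "<WORKSPACE-ID>"
TOKEN = "<ACCOUNT-API-OAUTH-TOKEN>"

url = (
    f"https://accounts.cloud.databricks.com/api/2.0/accounts/"
    f"{ACCOUNT_ID}/workspaces/{WORKSPACE_ID}"
)

while True:
    status = requests.get(
        url, headers={"Authorization": f"Bearer {TOKEN}"}
    ).json()["workspace_status"]
    print(status)
    if status != "PROVISIONING":
        break
    time.sleep(30)
```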
Log into a workspace
- Go to the account console and click the Workspaces icon.
- On the row with your workspace, click Open.
- To log in as a workspace administrator, log in with your account owner or account administrator email address and password. If you configured single sign-on, click the Single Sign On button.
Next steps
Now that you have deployed a workspace, you can start building out your data strategy. Databricks recommends the following articles:
- Add users, groups, and service principals to your workspace. See Manage users, service principals, and groups.
- Learn about data governance and managing data access in Databricks. See What is Unity Catalog?.
- Connect your Databricks workspace to your external data sources. See Connect to data sources and external services.
- Ingest your data into the workspace. See Standard connectors in Lakeflow Connect.
- Learn about managing access to workspace objects like notebooks, compute, dashboards, and queries. See Access control lists.