The purpose of the permissions in the Databricks cross-account IAM role

This article lists the permissions in the cross-account IAM role and the purpose of each permission.

The required permissions differ depending on how you configure your VPC.

IAM permissions for Databricks-managed VPC

Databricks requires the following IAM permissions to operate and manage clusters effectively. This configuration applies only to workspaces that use the default (Databricks-managed) VPC. To create the AWS cross-account role policy for use with the default Databricks-managed VPC, see Create a cross-account IAM role.

The following table lists Databricks IAM cross-account role permissions in the default configuration, the resources that they control, and the purpose for each permission.

AWS IAM permission

AWS resource

Purpose

ec2:AllocateAddress

Elastic IP address

Allocates an Elastic IP address that is associated with the NAT gateway used in secure cluster connectivity.

ec2:AssociateDhcpOptions

DHCPOptions

Associates a set of DHCP options (or no DHCP options) with a VPC.

ec2:AssociateIamInstanceProfile

InstanceProfile

Associates an instance profile with a running EC2 instance. This allows a Databricks pool instance to be used by clusters with different instance profiles throughout its lifetime in the pool.

ec2:AssociateRouteTable

RouteTable

Associates a subnet with a route table.

ec2:AttachInternetGateway

InternetGateway

Attaches an Internet gateway to a VPC, enabling connectivity between the Internet and the VPC. This is currently required to connect to S3 buckets and update code for the workers and Spark containers.

ec2:AttachVolume

EBS Volume

Attaches a volume for EBS auto-scaling.

ec2:AuthorizeSecurityGroupEgress

SecurityGroup

Adds egress rules to the security groups if required.

ec2:AuthorizeSecurityGroupIngress

SecurityGroup

Adds ingress rules to the security groups.

ec2:CancelSpotInstanceRequests

SpotInstance

Cancels spot instance requests.

ec2:CreateDhcpOptions

DHCPOptions

Creates DHCP options.

ec2:CreateInternetGateway

InternetGateway

Creates an Internet gateway.

ec2:CreateNatGateway

NatGateway

Creates a NAT gateway.

ec2:CreateRoute

Route

Creates routes during workspace setup.

ec2:CreateRouteTable

RouteTable

Creates route tables during workspace setup.

iam:CreateServiceLinkedRole

ServiceLinkedRole

Sets up support for spot instances.

ec2:CreateSecurityGroup

SecurityGroup

Creates security groups during initial setup.

ec2:CreateSubnet

Subnet

Creates subnets for the VPC during workspace setup.

ec2:CreateTags

Tags

Adds tags on Databricks resources.

ec2:CreateVolume

EBS Volume

Creates an EBS volume.

ec2:CreateVpc

VPC

Creates the Databricks-managed VPC.

ec2:CreateVpcEndpoint

VPCEndpoint

Creates VPC endpoints as part of configuring the VPC.

ec2:DeleteDhcpOptions

DHCPOptions

Deletes DHCP options.

ec2:DeleteInternetGateway

InternetGateway

Deletes the Internet gateway during workspace deletion.

ec2:DeleteNatGateway

NatGateway

Deletes a NAT gateway as needed to set up the secure cluster connectivity relay.

ec2:DeleteRoute

Route

Deletes routes.

ec2:DeleteRouteTable

RouteTable

Deletes a route table.

ec2:DeleteSecurityGroup

SecurityGroup

Deletes security groups during workspace deletion.

ec2:DeleteSubnet

Subnet

Deletes a subnet.

ec2:DeleteTags

Tags

Removes tags from cluster resources so that Databricks pool instances can be reused by clusters with different tags.

ec2:DeleteVolume

EBS Volume

Deletes a volume for EBS auto-scaling.

ec2:DeleteVpc

VPC

Deletes the VPC during workspace deletion.

ec2:DeleteVpcEndpoints

VPCEndpoints

Deletes the VPC endpoints during workspace deletion.

ec2:DescribeAvailabilityZones

AvailabilityZones

Gets a list of Availability Zones in a region so that Databricks can deploy resources in those zones.

ec2:DescribeIamInstanceProfileAssociations

InstanceProfile

Checks the current instance profile that is set on an EC2 instance so that the right profile is set on a Databricks pool instance before it’s reused by a cluster.

ec2:DescribeInstanceStatus

Instance

Confirms that Databricks AWS instances are healthy.

ec2:DescribeInstances

Instance

Confirms that Databricks AWS instances are healthy.

ec2:DescribeInternetGateways

InternetGateway

Describes Internet gateways to confirm that Databricks AWS instances have a route to the internet.

ec2:DescribeNatGateways

NATGateway

Describes NAT gateways to confirm that Databricks AWS instances have a route to the internet in the secure cluster connectivity architecture.

ec2:DescribePrefixLists

PrefixList

Gets a list of prefix list IDs to create an outbound security group rule that allows traffic from a VPC so that Databricks can access an AWS service through a gateway VPC endpoint.

ec2:DescribeReservedInstancesOfferings

Instance

Describes Reserved Instance pricing in support of AWS spot instance pricing.

ec2:DescribeRouteTables

RouteTable

Confirms that the route tables are set up correctly in the Databricks-managed VPC.

ec2:DescribeSecurityGroups

SecurityGroup

Confirms that AWS security groups are set up correctly.

ec2:DescribeSpotInstanceRequests

Instance

Describes spot instance requests.

ec2:DescribeSpotPriceHistory

SpotInstance

Describes the spot price history.

ec2:DescribeSubnets

Subnet

Confirms that subnets are set up correctly in the Databricks-managed VPC.

ec2:DescribeVolumes

Volume

Lists EBS volumes.

ec2:DescribeVpcs

VPC

Confirms that the workspace’s VPC was set up correctly.

ec2:DetachInternetGateway

InternetGateway

Detaches the Databricks-created Internet gateway during workspace deletion.

ec2:DisassociateIamInstanceProfile

InstanceProfile

Disassociates an instance profile from an EC2 instance so that Databricks pool instances can be used by clusters with different instance profiles.

ec2:DisassociateRouteTable

RouteTable

Disassociates the Databricks-created route table from its subnet during workspace deletion.

ec2:ModifyVpcAttribute

VPCAttribute

Configures the Databricks-managed VPC.

iam:PutRolePolicy

RolePolicy

Configures Databricks to use spot instances.

ec2:ReleaseAddress

Address

Releases the Databricks-created Elastic IP address during workspace deletion.

ec2:ReplaceIamInstanceProfileAssociation

InstanceProfile

Swaps one instance profile for another on an EC2 instance so that Databricks pool instances can be used by clusters with different instance profiles.

ec2:RequestSpotInstances

SpotInstance

Requests spot instances.

ec2:RevokeSecurityGroupEgress

SecurityGroup

Updates Databricks-managed security groups if required.

ec2:RevokeSecurityGroupIngress

SecurityGroup

Updates security groups.

ec2:RunInstances

Instance

Launches AWS instances to create Spark clusters. Also used to scale up an existing Spark cluster.

ec2:TerminateInstances

Instance

Terminates Spark EC2 nodes during cluster scale down or to terminate a given Spark cluster.
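As an illustration of how the permissions in the table above end up in a policy, the following minimal Python sketch assembles an IAM policy document from a list of actions. The helper name build_policy, the statement ID, and the unscoped "Resource": "*" are assumptions made for this example, and the action list is only a small subset of the table. For the actual policy, follow Create a cross-account IAM role.

```python
import json

# Illustrative subset of the actions from the table above, not the full list.
DATABRICKS_MANAGED_VPC_ACTIONS = [
    "ec2:RunInstances",
    "ec2:TerminateInstances",
    "ec2:CreateVpc",
    "ec2:DeleteVpc",
    "ec2:CreateTags",
]


def build_policy(actions, sid="DatabricksCrossAccountSketch"):
    """Return an IAM policy document (as a dict) that allows the given actions.

    Hypothetical helper for illustration only. It produces one Allow
    statement and leaves the resource unscoped.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,
                "Effect": "Allow",
                # De-duplicate and sort so the output is stable and easy to diff.
                "Action": sorted(set(actions)),
                "Resource": "*",  # assumption: no resource-level scoping
            }
        ],
    }


policy = build_policy(DATABRICKS_MANAGED_VPC_ACTIONS)
print(json.dumps(policy, indent=2))
```

Sorting the actions keeps the generated document stable, which makes it easier to compare against the table when permissions are added or removed.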

IAM permissions for customer-managed VPC

If you use a customer-managed VPC, the cross-account IAM role needs a smaller set of permissions. This feature requires the Premium or Enterprise tier.

To create the AWS cross-account role policy for use with a customer-managed VPC, see Customer-managed VPC with default policy restrictions.

The permissions can be further scoped down if needed. To create the AWS cross-account role policy for use with a customer-managed VPC with additional custom restrictions on resources, see Customer-managed VPC with custom policy restrictions.
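One way to picture this kind of scoping is a condition that limits a statement to a single VPC. The Python sketch below adds a StringEquals condition on the ec2:vpc condition key to an existing statement. The helper name scope_to_vpc, the choice of actions, and the placeholder region, account ID, and VPC ID are assumptions for illustration. Not every EC2 action supports this condition key, so check the AWS documentation and the custom policy restrictions article before relying on it.

```python
import copy


def scope_to_vpc(statement, vpc_arn):
    """Return a copy of an IAM statement restricted to one VPC.

    Hypothetical helper: adds a StringEquals condition on the ec2:vpc
    condition key and leaves the original statement untouched.
    """
    scoped = copy.deepcopy(statement)
    condition = scoped.setdefault("Condition", {})
    condition.setdefault("StringEquals", {})["ec2:vpc"] = vpc_arn
    return scoped


# Two security group permissions from the customer-managed VPC list.
base_statement = {
    "Effect": "Allow",
    "Action": [
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:RevokeSecurityGroupIngress",
    ],
    "Resource": "*",
}

# Placeholder ARN: the region, account ID, and VPC ID are made up.
vpc_arn = "arn:aws:ec2:us-west-2:123456789012:vpc/vpc-0123456789abcdef0"

scoped_statement = scope_to_vpc(base_statement, vpc_arn)
```

Returning a copy rather than mutating the input keeps the unscoped statement available for comparison.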

The following table lists Databricks IAM cross-account role permissions for a customer-managed VPC, the resources that they control, and the purpose for each permission.

AWS IAM permission

AWS resource

Purpose

ec2:AssociateIamInstanceProfile

InstanceProfile

Associates an instance profile with a running EC2 instance so that a Databricks pool instance can be used by clusters with different instance profiles throughout its lifetime in the pool.

ec2:AttachVolume

Volume

Attaches a volume.

ec2:AuthorizeSecurityGroupEgress

SecurityGroup

Adds egress rules to the security groups if required.

ec2:AuthorizeSecurityGroupIngress

SecurityGroup

Adds ingress rules to the security groups.

ec2:CancelSpotInstanceRequests

SpotInstance

Cancels spot instance requests.

ec2:CreateTags

Tags

Adds tags on Databricks resources.

ec2:CreateVolume

Volume

Creates a volume.

ec2:DeleteTags

Tags

Removes tags from cluster resources so that Databricks pool instances can be reused by clusters with different tags.

ec2:DeleteVolume

Volume

Deletes a volume.

ec2:DescribeAvailabilityZones

AvailabilityZones

Gets a list of Availability Zones in a region so that Databricks can deploy the resources in those zones.

ec2:DescribeIamInstanceProfileAssociations

InstanceProfile

Checks the current instance profile set on an EC2 instance to confirm that the right profile is set on a Databricks pool instance before it’s reused by a cluster.

ec2:DescribeInstanceStatus

Instance

Confirms that Databricks AWS instances are healthy.

ec2:DescribeInstances

Instance

Confirms that Databricks AWS instances are healthy.

ec2:DescribeInternetGateways

InternetGateway

Describes Internet gateways to confirm that Databricks AWS instances have a route to the internet.

ec2:DescribeNatGateways

NATGateway

Describes NAT gateways to confirm that Databricks AWS instances have a route to the internet in the secure cluster connectivity architecture.

ec2:DescribeNetworkAcls

NetworkAcl

Confirms the correct Network ACL setup.

ec2:DescribePrefixLists

PrefixList

Gets a list of prefix list IDs to create an outbound security group rule that allows traffic from a VPC to access an AWS service through a gateway VPC endpoint.

ec2:DescribeReservedInstancesOfferings

Instance

Gets Reserved Instance pricing as the starting point for AWS spot instance pricing.

ec2:DescribeRouteTables

RouteTable

Confirms that route tables are set up correctly in the VPC.

ec2:DescribeSecurityGroups

SecurityGroup

Confirms that AWS security groups are set up correctly.

ec2:DescribeSpotInstanceRequests

Instance

Describes spot instance requests.

ec2:DescribeSpotPriceHistory

SpotInstance

Describes the spot price history.

ec2:DescribeSubnets

Subnet

Confirms that subnets are set up correctly in the VPC.

ec2:DescribeVolumes

Volume

Lists volumes.

ec2:DescribeVpcAttribute

VPC

Describes VPC attributes including but not limited to enableDnsHostnames.

ec2:DescribeVpcs

VPC

Confirms that the Databricks workspace VPC was created.

ec2:DetachVolume

Volume

Detaches an EBS volume from EC2 instances during cluster shutdown.

ec2:DisassociateIamInstanceProfile

InstanceProfile

Disassociates an instance profile from an EC2 instance so that pool instances can be used by clusters with different instance profiles.

ec2:ReplaceIamInstanceProfileAssociation

InstanceProfile

Swaps one instance profile for another on an EC2 instance so that pool instances can be used by clusters with different instance profiles.

ec2:RequestSpotInstances

SpotInstance

Requests spot instances.

ec2:RevokeSecurityGroupEgress

SecurityGroup

Updates Databricks-managed security groups if required.

ec2:RevokeSecurityGroupIngress

SecurityGroup

Updates security groups.

ec2:RunInstances

Instance

Launches AWS instances to create Spark Clusters. Also used to scale up an existing Spark cluster.

ec2:TerminateInstances

Instance

Terminates Spark EC2 nodes during cluster scale down or to terminate a Spark cluster.
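To check an existing policy against the customer-managed VPC table above, a small script can report which listed permissions are missing. The following Python sketch is one way to do that. The function names and the sample policy are assumptions for this example, and the check compares action names literally, so it does not expand wildcards such as ec2:* or evaluate Deny statements.

```python
# The ec2 actions listed in the customer-managed VPC table above.
REQUIRED_ACTIONS = {
    "ec2:" + name
    for name in """
    AssociateIamInstanceProfile AttachVolume AuthorizeSecurityGroupEgress
    AuthorizeSecurityGroupIngress CancelSpotInstanceRequests CreateTags
    CreateVolume DeleteTags DeleteVolume DescribeAvailabilityZones
    DescribeIamInstanceProfileAssociations DescribeInstanceStatus
    DescribeInstances DescribeInternetGateways DescribeNatGateways
    DescribeNetworkAcls DescribePrefixLists DescribeReservedInstancesOfferings
    DescribeRouteTables DescribeSecurityGroups DescribeSpotInstanceRequests
    DescribeSpotPriceHistory DescribeSubnets DescribeVolumes
    DescribeVpcAttribute DescribeVpcs DetachVolume
    DisassociateIamInstanceProfile ReplaceIamInstanceProfileAssociation
    RequestSpotInstances RevokeSecurityGroupEgress RevokeSecurityGroupIngress
    RunInstances TerminateInstances
    """.split()
}


def allowed_actions(policy):
    """Collect the literal action names from all Allow statements."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM permits a single statement object
        statements = [statements]
    actions = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        action = statement.get("Action", [])
        actions.update([action] if isinstance(action, str) else action)
    return actions


def missing_actions(policy, required=REQUIRED_ACTIONS):
    """Return the required actions that the policy does not list, sorted."""
    return sorted(required - allowed_actions(policy))


# Hypothetical policy that omits one permission from the table.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": sorted(REQUIRED_ACTIONS - {"ec2:DetachVolume"}),
            "Resource": "*",
        }
    ],
}

missing = missing_actions(sample_policy)
print(missing)
```

Because the comparison is literal, a policy that grants permissions through wildcards would be reported as missing actions even though AWS would allow them.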