Classic compute plane networking
This article introduces features to customize network access between the Databricks control plane and the classic compute plane. Connectivity between the control plane and the serverless compute plane is always over the cloud network backbone and not the public internet.
To learn more about the control plane and the compute plane, see Databricks architecture overview.
To learn more about classic compute and serverless compute, see Types of compute.
The features in this section focus on establishing and securing the connection between the Databricks control plane and the classic compute plane. This connection is labeled as 2 in the diagram below:
What is secure cluster connectivity?
All new workspaces are created with secure cluster connectivity by default. Secure cluster connectivity means that customer VPCs have no open ports and classic compute plane resources have no public IP addresses. This simplifies network administration by removing the need to configure ports on security groups or network peering.
Secure cluster connectivity ensures that clusters connect to the Databricks control plane through a secure tunnel using HTTPS (port 443) without requiring public IP addresses on cluster nodes. This connection is established using a secure cluster connectivity relay, which separates the network traffic for the web application and REST API from cluster management tasks.
Although the serverless compute plane does not use the secure cluster connectivity relay for the classic compute plane, serverless compute resources do not have public IP addresses.
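As a rough illustration of the "no open ports" property, the following sketch flags inbound security group rules that accept traffic from any address. It assumes rule dictionaries shaped like the `IpPermissions` entries returned by the EC2 `describe_security_groups` API. The function name and the sample data are hypothetical and are not part of Databricks.

```python
from typing import Any

# CIDR blocks that mean "any address" for IPv4 and IPv6.
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def find_open_inbound_rules(ip_permissions: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return the inbound rules that allow traffic from any address."""
    open_rules = []
    for rule in ip_permissions:
        ipv4 = [r.get("CidrIp") for r in rule.get("IpRanges", [])]
        ipv6 = [r.get("CidrIpv6") for r in rule.get("Ipv6Ranges", [])]
        if OPEN_CIDRS.intersection(ipv4 + ipv6):
            open_rules.append(rule)
    return open_rules


# Hypothetical sample: one rule that references a security group
# and one rule that exposes SSH to the internet.
sample_permissions = [
    {"IpProtocol": "-1", "UserIdGroupPairs": [{"GroupId": "sg-0123456789abcdef0"}]},
    {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
     "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
]

flagged = find_open_inbound_rules(sample_permissions)
print(len(flagged))  # 1
```

With secure cluster connectivity, a check like this would be expected to return an empty list for the security groups attached to classic compute resources.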
Deploy a workspace in your own VPC
An AWS Virtual Private Cloud (VPC) lets you provision a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network. The VPC is the network location for your Databricks clusters. By default, Databricks creates and manages a VPC for the Databricks workspace.
You can instead provide your own VPC to host your Databricks clusters, enabling you to maintain more control of your own AWS account and limit outgoing connections. To take advantage of a customer-managed VPC, you must specify a VPC when you first create the Databricks workspace. For more information, see Configure a customer-managed VPC.
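The sketch below shows what a network configuration for a customer-managed VPC might look like before it is registered with Databricks. The field names (`network_name`, `vpc_id`, `subnet_ids`, `security_group_ids`) and the two validation checks are assumptions based on the Databricks Account API network configuration and should be confirmed in Configure a customer-managed VPC. The helper name and all resource IDs are hypothetical, and the code only builds the request body without sending it.

```python
import json


def build_network_config(network_name, vpc_id, subnet_ids, security_group_ids):
    """Build a request body describing a customer-managed VPC (assumed schema)."""
    # Assumption: a workspace needs at least two distinct subnets.
    if len(set(subnet_ids)) < 2:
        raise ValueError("expected at least two distinct subnets")
    # Assumption: at least one security group must be supplied.
    if not security_group_ids:
        raise ValueError("expected at least one security group")
    return {
        "network_name": network_name,
        "vpc_id": vpc_id,
        "subnet_ids": list(subnet_ids),
        "security_group_ids": list(security_group_ids),
    }


# Hypothetical identifiers, for illustration only.
payload = build_network_config(
    "example-network",
    "vpc-0123456789abcdef0",
    ["subnet-aaaa1111", "subnet-bbbb2222"],
    ["sg-0123456789abcdef0"],
)
print(json.dumps(payload, indent=2))
```

Validating the configuration before workspace creation matters because the VPC must be specified when the workspace is first created.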
Peer the Databricks VPC with another AWS VPC
By default, Databricks creates and manages a VPC for the Databricks workspace. For additional security, workers that belong to a cluster can only communicate with other workers that belong to the same cluster. Workers cannot communicate with any other EC2 instances or other AWS services running in the Databricks VPC. If you run an AWS service in the same VPC as the Databricks cluster, the cluster might not be able to reach that service because of this firewall restriction. Instead, you can run such services outside of the Databricks VPC and peer that VPC with the Databricks VPC to connect to them. See VPC peering.
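AWS VPC peering requires that the two VPCs have non-overlapping CIDR blocks, so it can help to check the address ranges before requesting a peering connection. The following minimal sketch uses Python's standard `ipaddress` module. The function name and the sample CIDR blocks are hypothetical.

```python
import ipaddress


def cidrs_overlap(cidr_a: str, cidr_b: str) -> bool:
    """Return True if the two CIDR blocks share any addresses."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


# Hypothetical address ranges for the Databricks VPC and a peer VPC.
print(cidrs_overlap("10.0.0.0/16", "10.0.128.0/17"))  # True: cannot be peered
print(cidrs_overlap("10.0.0.0/16", "10.1.0.0/16"))    # False: safe to peer
```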
Enable private connectivity from the classic compute plane to the control plane
AWS PrivateLink provides private connectivity from AWS VPCs and on-premises networks to AWS services without exposing the traffic to the public network. You can enable private connectivity from the classic compute plane to the Databricks workspace's core services in the control plane by enabling AWS PrivateLink.
For more information, see Enable private connectivity using AWS PrivateLink.