This feature is available only if your account is on the E2 version of the Databricks platform. Secure cluster connectivity is enabled for accounts on the E2 platform by default as of September 1, 2020.
With secure cluster connectivity enabled, customer VPCs have no open ports and Databricks Runtime cluster nodes have no public IP addresses.
This article uses the term data plane to mean the compute layer of the Databricks platform. In this article, data plane refers specifically to the Classic data plane in your AWS account. By contrast, the Serverless data plane that supports Serverless SQL endpoints (Public Preview) runs in the Databricks AWS account. To learn more, see Serverless compute.
- At a network level, each cluster initiates a connection to the secure cluster connectivity relay in the control plane during cluster creation. The cluster establishes this connection over port 443 (HTTPS), to a different IP address than the one used for the web application and REST API.
- When the control plane logically starts new Databricks Runtime jobs or performs other cluster administration tasks, it sends these requests to the cluster through the connection that the cluster opened, which acts as a reverse tunnel.
- The data plane (the VPC) has no open ports, and Databricks Runtime cluster nodes have no public IP addresses.
- Easy network administration, with no need to configure ports on security groups or to configure network peering.
- With enhanced security and simple network administration, information security teams can expedite approval of Databricks as a PaaS provider.
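The reverse tunnel pattern described above can be illustrated with a toy sketch. This is not the Databricks relay: it uses plain TCP on localhost instead of HTTPS on port 443, and the function names (`run_relay`, `run_cluster`, `demo`) and the `start-job` command are hypothetical. It only shows the direction of the connection: the cluster side dials out and never listens, yet the relay side can still push a request down that connection.

```python
import socket
import threading


def run_relay(listener, command, replies):
    """Control plane side (toy): accept the cluster's outbound connection,
    then send a command back through it and record the reply."""
    conn, _ = listener.accept()
    with conn, conn.makefile("r") as reader:
        conn.sendall(command.encode() + b"\n")
        replies.append(reader.readline().strip())


def run_cluster(relay_addr):
    """Data plane side (toy): dial out to the relay. No listening socket
    is ever opened here, so there is no inbound port to expose."""
    with socket.create_connection(relay_addr) as conn, conn.makefile("r") as reader:
        command = reader.readline().strip()
        conn.sendall(f"done: {command}\n".encode())


def demo():
    # The relay listens on an ephemeral localhost port (stand-in for port 443).
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    replies = []
    relay = threading.Thread(target=run_relay, args=(listener, "start-job", replies))
    relay.start()
    run_cluster(listener.getsockname())
    relay.join()
    listener.close()
    return replies


if __name__ == "__main__":
    print(demo())
```

Because only the relay accepts connections, the cluster side needs outbound access alone, which is why the data plane can run without open ports or public IP addresses.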
To use secure cluster connectivity for a workspace, create a new workspace using the Account API. You cannot add secure cluster connectivity to an existing workspace.
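As a rough sketch of what creating a workspace through the Account API might look like, the following builds (but does not send) the HTTP request. The host, endpoint path, and field names are assumptions recalled from the Account API and should be checked against the API reference. The helper name `build_create_workspace_request` and all IDs are hypothetical, and authentication is deliberately left out.

```python
import json
import urllib.request

# Assumed Account API host; confirm against the API reference.
ACCOUNTS_HOST = "https://accounts.cloud.databricks.com"


def build_create_workspace_request(account_id, workspace_name, aws_region,
                                   credentials_id, storage_configuration_id,
                                   network_id=None):
    """Build a POST request for creating a new workspace.

    Field names are assumptions, not a verified schema.
    """
    payload = {
        "workspace_name": workspace_name,
        "aws_region": aws_region,
        "credentials_id": credentials_id,
        "storage_configuration_id": storage_configuration_id,
    }
    if network_id is not None:
        # Only relevant when deploying into a customer-managed VPC.
        payload["network_id"] = network_id

    url = f"{ACCOUNTS_HOST}/api/2.0/accounts/{account_id}/workspaces"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values for illustration only.
    req = build_create_workspace_request(
        "acct-123", "my-workspace", "us-west-2", "cred-1", "storage-1"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

To actually submit the request, you would add credentials and pass the object to `urllib.request.urlopen`. Building the request separately makes it easy to inspect the payload first.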