
Networking

This article introduces networking configurations for the deployment and management of Databricks accounts and workspaces.

Note: Databricks charges for networking costs when serverless workloads connect to customer resources. See Understand Databricks serverless networking costs.

Databricks architecture overview

Databricks operates out of a control plane and a compute plane.

  • The control plane includes the backend services that Databricks manages in your Databricks account. The web application is in the control plane.
  • The compute plane is where your data is processed. There are two types of compute planes depending on the compute that you are using.
    • For classic Databricks compute, the compute resources are in your AWS account in what is called the classic compute plane. This refers to the network in your AWS account and its resources. Classic compute plane resources are in the region that your workspace is in.
    • For serverless compute, the serverless compute resources run in a serverless compute plane in your Databricks account. Serverless compute plane resources are in the same cloud region as your workspace's classic compute plane. You select this region when creating a workspace.

To learn more about classic compute and serverless compute, see Compute. For additional architecture information, see High-level architecture.
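As a rough illustration of the split described above, the following Python sketch models where each type of compute runs. The class and function names (`ComputeType`, `ComputePlane`, `compute_plane_for`) and the example region are hypothetical and exist only for this illustration; they are not part of any Databricks API.

```python
from dataclasses import dataclass
from enum import Enum


class ComputeType(Enum):
    CLASSIC = "classic"
    SERVERLESS = "serverless"


@dataclass(frozen=True)
class ComputePlane:
    name: str
    hosting_account: str  # the account in which the compute resources run
    region: str           # always the workspace's region


def compute_plane_for(compute_type: ComputeType, workspace_region: str) -> ComputePlane:
    """Return where compute of the given type runs (illustrative model only)."""
    if compute_type is ComputeType.CLASSIC:
        # Classic compute resources live in the customer's own AWS account.
        return ComputePlane("classic compute plane", "your AWS account", workspace_region)
    # Serverless compute resources live in the Databricks account.
    return ComputePlane("serverless compute plane", "your Databricks account", workspace_region)


# Hypothetical workspace region, chosen when the workspace is created.
classic = compute_plane_for(ComputeType.CLASSIC, "us-west-2")
serverless = compute_plane_for(ComputeType.SERVERLESS, "us-west-2")
print(classic)
print(serverless)
```

In this model both planes report the same region, while only the hosting account differs. That difference is what determines which networking features apply to each plane.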

Secure network connectivity

Databricks provides a secure networking environment by default. If your organization has additional requirements, you can configure network connectivity features for each of the connections shown in the following diagram.

Network connectivity overview diagram

  1. Users and applications to Databricks: You can configure features to control access and provide private connectivity between users and their Databricks workspaces. See Users to Databricks networking.
  2. The control plane and the classic compute plane: Classic compute resources, such as clusters, are deployed in your AWS account and connect to the control plane. You can use classic network connectivity features to deploy classic compute plane resources in your own virtual private cloud and to enable private connectivity from the clusters to the control plane. See Classic compute plane networking.
  3. The serverless compute plane and storage: You can configure firewalls on your resources to allow access from the Databricks serverless compute plane. See Serverless compute plane networking.

You can configure your AWS storage networking features to secure the connection between the classic compute plane and S3. For more information, see Configure Databricks S3 commit service-related settings and Networking recommendations for Lakehouse Federation.

Connectivity between the control plane and the serverless compute plane is always over the cloud network backbone and not the public internet.

Get started

Understand Databricks networking architecture and explore key concepts.

| Topic | Description |
| --- | --- |
| Databricks architecture overview | Learn about the control plane and compute plane architecture that forms the foundation of Databricks networking. |
| AWS PrivateLink | Establish private connections between your network and Databricks using AWS PrivateLink for enhanced security. |
| Understand data transfer and connectivity costs | Learn about data transfer pricing and optimize costs for network connectivity features. |

Connectivity

Configure secure network connections for inbound access to workspaces and outbound connectivity from compute resources.

| Topic | Description |
| --- | --- |
| Front-end networking | Configure network access controls for users connecting to Databricks workspaces through the web interface and APIs. |
| Front-end PrivateLink | Enable private connectivity from your corporate network to Databricks workspaces using AWS PrivateLink. |
| Serverless compute plane networking | Configure secure network access between serverless compute resources and your data sources and services. |
| Private connectivity to AWS resources | Establish private connections from serverless compute to AWS services like S3, DynamoDB, and RDS. |
| Private connectivity to resources in your VPC | Connect serverless compute to resources running in your own VPC using private endpoints. |
| Manage private endpoint rules | Configure and manage private endpoint rules for serverless compute connectivity. |
| Classic compute plane networking | Learn about networking options for classic compute resources deployed in your virtual network. |
| Deploy Databricks in a customer-managed VPC | Host Databricks clusters in your own AWS VPC for enhanced network control. |
| VPC peering | Connect your Databricks VPC to other VPCs in your AWS account to access additional resources. |
| Back-end PrivateLink | Establish private connectivity between classic compute resources and the Databricks control plane. |
| Manage private access settings | Configure private connectivity settings at the account level for workspace deployment. |
| Manage VPC endpoint registrations | Register and manage VPC endpoints for private connectivity to Databricks services. |
| Configure Databricks S3 commit service-related settings | Optimize S3 write operations with network configurations for the Databricks S3 commit service. |

Network security

Implement security controls to restrict and monitor network access.

| Topic | Description |
| --- | --- |
| What is serverless egress control? | Restrict outbound network connections from serverless compute resources to prevent data exfiltration and enforce compliance. |
| Manage network policies for serverless egress control | Create and manage network policies that define allowed egress connections from serverless compute. |
| IP access lists overview | Learn how to use IP access lists to control which IP addresses can access your Databricks workspaces. |
| IP access lists for workspaces | Configure workspace-level IP access controls to restrict access from approved networks. |
| IP access lists for the account console | Set account-level IP restrictions that apply across multiple workspaces for centralized security management. |
| Configure a firewall for serverless compute access | Use stable IP addresses to configure firewall rules for serverless compute connectivity. |
| Domain name firewall rules | Configure domain-based firewall rules to allow Databricks services through your network security controls. |
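
To make the idea behind the IP access list topics above more concrete, the following Python sketch shows the general kind of check an allow list implies: a client address is accepted only if it falls inside one of the approved CIDR ranges. This is a conceptual illustration, not the Databricks implementation or API. The `ALLOWED_RANGES` values are documentation-reserved example addresses, and `is_allowed` is a hypothetical helper.

```python
import ipaddress
from typing import Iterable

# Hypothetical approved networks (RFC 5737 documentation ranges, not real values).
ALLOWED_RANGES = ["203.0.113.0/24", "198.51.100.17/32"]


def is_allowed(client_ip: str, allowed_ranges: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any of the approved CIDR ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_ranges)


print(is_allowed("203.0.113.45", ALLOWED_RANGES))  # True: inside 203.0.113.0/24
print(is_allowed("192.0.2.10", ALLOWED_RANGES))    # False: in no approved range
```

See the IP access list articles for the actual configuration options and how they are evaluated.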