
Phase 4: Design network architecture

In this phase, you design network infrastructure for Databricks workspaces, including architecture patterns, connectivity options, and security controls.

Understand Databricks networking

The network architecture in Databricks governs three distinct communication paths:

  • Inbound (front-end) connectivity: User access to the admin console and workspaces through the UI and APIs.
  • Outbound (serverless) connectivity: Workload connections from Databricks serverless compute to your customer resources.
  • Classic (back-end) connectivity: Connections from the classic compute plane to the control plane.

How network security is enforced depends directly on the compute model:

  • Classic compute: Workloads run in customer-managed cloud networks, so network posture is primarily implemented through customer-managed segmentation, routing, private connectivity, and egress controls.
  • Serverless compute: Workloads run in a Databricks-managed compute plane, so administrators rely more on platform controls and account-level configurations to govern connectivity, particularly outbound access, while still aligning to the same risk model and enterprise network requirements.

Administrators should view these network controls as complementary to workspace boundaries and workspace-level guardrails. Workspace controls define user operations and permitted execution patterns, while network controls constrain reachability and data movement paths. By using both layers of protection, organizations can reduce the security blast radius and avoid over-reliance on any single control layer.

Design virtual private network configuration

Databricks provides flexible networking options to align with your organization's security and compliance posture across both classic and serverless compute architectures.

Customer-managed virtual private cloud for classic workspaces

Databricks classic workspaces are deployed within a virtual private cloud (VPC) in your cloud environment. For maximum control, Databricks recommends using customer-managed virtual networks for your classic workspaces. This model provides the greatest control over network topology, subnet ranges, and security groups, which is essential for meeting strict security and compliance requirements.

Serverless compute (Databricks-managed connectivity)

For serverless compute resources (such as serverless SQL warehouses), Databricks manages the compute plane networking, offering operational simplicity and reduced administrative overhead. However, this model still provides robust control over data plane security and access:

  • Secure ingress/egress: Features such as secure cluster connectivity and Private Link ensure private communication between your workspace, compute, and data sources, regardless of the compute model.
  • Serverless private connectivity: Network Connectivity Configurations (NCC) enable you to define egress rules for Databricks-managed serverless compute, offering granular control over where your compute plane traffic is directed.

This layered approach enables you to select the optimal balance of operational simplicity and granular network control for various workloads within your lakehouse architecture.

Secure cluster connectivity (SCC)

Secure cluster connectivity (SCC) is the recommended baseline and the default deployment mode for new workspaces. SCC reverses the connection direction between the control plane and the compute plane:

Each cluster initiates a connection to the SCC relay in the control plane, establishing a secure communication tunnel. The control plane then sends cluster administration tasks back to the cluster through this tunnel. As a result, no open ports or public IP addresses are required on the classic compute plane nodes. All communication from the classic compute plane to the control plane is outbound.
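The reversed connection direction can be illustrated with a minimal, purely conceptual sketch. Plain TCP sockets stand in for the SCC relay and a cluster node here (this is not the actual SCC protocol): the "cluster" only ever dials out, yet still receives commands from the "control plane" over the tunnel it opened.

```python
import socket
import threading

def relay(listener: socket.socket) -> None:
    # "Control plane" side: accept the cluster's outbound connection,
    # then push an admin command DOWN the already-open tunnel.
    conn, _ = listener.accept()
    conn.sendall(b"RESTART_CLUSTER")
    conn.close()

def cluster_node(port: int) -> bytes:
    # "Compute plane" side: only dials out; it never listens,
    # so no inbound ports or public IP addresses are needed.
    with socket.create_connection(("127.0.0.1", port)) as tunnel:
        return tunnel.recv(1024)

listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # ephemeral port, standing in for the SCC relay
listener.listen(1)
port = listener.getsockname()[1]

t = threading.Thread(target=relay, args=(listener,))
t.start()
command = cluster_node(port)  # the command arrives over the outbound-initiated tunnel
t.join()
listener.close()
```

Note how the compute side opens no listening socket at all, which is exactly why SCC removes the need for inbound ports and public IPs on compute nodes.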

SCC architecture benefits

  • No public IP addresses required on compute nodes.
  • No inbound ports required on compute security groups.
  • Simplified network security posture.
  • Reduced attack surface for compute resources.

Best practices for SCC

  • Enable SCC for all new workspaces (default).
  • Use SCC as the baseline security posture.
  • SCC is compatible with Private Link and other advanced networking features.

Design IP access control strategy

Configure IP access lists to restrict the IP addresses that can connect to Databricks. Each request is checked against a set of known "good" IP ranges, such as a VPN or office network. Established user sessions stop working if the user switches to a disallowed IP address, such as when disconnecting from the VPN.
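The check can be sketched as follows, assuming hypothetical VPN and office CIDR ranges (the real enforcement happens server-side in Databricks on every request):

```python
import ipaddress

# Hypothetical allowlist: a corporate VPN range plus one office network.
ALLOWED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),   # corporate VPN egress
    ipaddress.ip_network("198.51.100.0/25"),  # headquarters office
]

def is_allowed(client_ip: str) -> bool:
    """Return True when the caller's address falls inside a known-good range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_RANGES)
```

A user who drops off the VPN moves to an address outside these ranges, so subsequent requests fail even though the session itself was established.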

IP access list levels

  • Workspace-level IP access lists: Applied to individual workspaces.
  • Account-level IP access lists: Applied to all workspaces in the account and account console access.

IP access list patterns

  • Corporate VPN: Allow access only from corporate VPN IP ranges.
  • Office networks: Allow access from specific office locations.
  • Cloud provider networks: Allow access from specific cloud regions or VPCs.
  • Hybrid approach: Combine multiple IP ranges for different user types.

Best practices for IP access lists

  • Start with account-level IP access lists for consistent enforcement.
  • Use workspace-level lists for workspace-specific requirements.
  • Document IP ranges and their purposes.
  • Plan for remote work scenarios (VPN requirements).
  • Test IP access lists before full enforcement.

Design data exfiltration protection

Data exfiltration protection for workspaces can be set up by securing the network, restricting routing, and adding a network firewall to restrict outbound access from workspaces.

Data exfiltration protection patterns

  • Network segmentation: Deploy workspaces in isolated virtual private clouds.
  • Egress filtering: Use network firewalls to control outbound traffic.
  • Private connectivity: Use Private Link to prevent internet exposure.
  • Workspace features: Disable features that could leak data (for example, notebook export, data download buttons).

Best practices for data exfiltration protection

  • Evaluate data exfiltration protections based on data sensitivity.
  • Use Private Link for highly sensitive environments.
  • Configure network firewalls to allow only required destinations.
  • Disable workspace features that could enable data exfiltration.

For detailed data exfiltration setup guidance, see Networking.

Design Private Link connectivity

Private Link enables private connectivity from your cloud virtual networks and on-premises networks to cloud services, avoiding exposure to the public internet.

Private Link architecture

  • Front-end Private Link: Private connectivity to workspace UI and APIs.
  • Back-end Private Link: Private connectivity from compute to control plane services.

Private Link is only supported at the workspace level. IP access lists can continue to protect account-level services.

Best practices for Private Link

  • Use Private Link for workspaces with highly sensitive data.
  • Enable both front-end and back-end Private Link for maximum isolation.
  • Plan DNS configuration for Private Link endpoints.
  • Test Private Link connectivity before production use.
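One way to sanity-check a Private Link DNS cut-over is to classify the addresses a workspace hostname resolves to. A small sketch (the resolution step itself is omitted; the input is a list of already-resolved IPs):

```python
import ipaddress

def classify_endpoint_ips(resolved_ips):
    """Split resolved addresses into private vs. public.

    After Private Link DNS is in place, the workspace hostname should
    resolve only to private addresses; any public address means DNS is
    still pointing at the public endpoint.
    """
    private, public = [], []
    for ip in resolved_ips:
        (private if ipaddress.ip_address(ip).is_private else public).append(ip)
    return private, public
```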

Design serverless connectivity (NCC)

Serverless compute resources run in the serverless compute plane, which is managed by Databricks. Account admins can configure secure connectivity between the serverless compute plane and their resources using Network Connectivity Configurations (NCC).

NCC capabilities

  • Stable IPs: A set of stable egress IP addresses that you can allowlist on firewalls.
  • Private endpoints: Private connectivity from serverless compute to your resources, on clouds that support it.

NCC architecture

Account admins create NCCs in the account console, and each NCC can be attached to one or more workspaces. When an NCC is attached to a workspace, serverless compute in that workspace uses the NCC's network configuration to establish secure outgoing connections to customer resources. The specific mechanism depends on the cloud provider, as described in the capabilities above.

note

The NCC does not impact any inbound connectivity to serverless resources.

Best practices for NCC

  • Create separate NCCs for different environments (for example, dev, staging, production).
  • Create separate NCCs for different business units when isolation is required.
  • Use NCCs to control serverless egress to customer resources.
  • Allow-list NCC IP ranges on storage firewalls and databases.
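When allow-listing NCC stable IP ranges on storage firewalls, adjacent CIDR blocks can often be merged into fewer firewall rules. A sketch using the standard library (the example ranges are hypothetical):

```python
import ipaddress

def collapse_egress_ranges(cidrs):
    """Merge adjacent/overlapping egress CIDRs into the minimal rule set."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]

# Two adjacent /25 blocks collapse into a single /24 firewall rule.
rules = collapse_egress_ranges(["192.0.2.0/25", "192.0.2.128/25"])
```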

AWS network architecture

Base VPC configuration

For a classic AWS deployment with compute resources deployed in a VPC in a customer's AWS account, the primary architecture requires:

Subnet requirements

  • At least two subnets, each defined in a different availability zone (AZ) within the AWS cloud region.
  • Dedicate these subnets to the EC2 instances that back Spark clusters and SQL warehouses.
  • Databricks assigns two IP addresses per node (EC2 instance):
    • One used for management traffic: orchestration, monitoring, and control plane communications.
    • One used by the Spark container for intra-cluster application traffic.

Subnet sizing

Databricks doesn't limit netmasks for the workspace VPC, but each workspace subnet must have a netmask between /17 and /26. The total number of instances for each subnet is equal to half the number of available IP addresses, after removing the five reserved IP addresses in a subnet.

| VPC size (CIDR) | Subnet size (CIDR) | Maximum Databricks cluster nodes per subnet/AZ |
| --------------- | ------------------ | ---------------------------------------------- |
| /16             | /17                | 16,381 (= (32,768 - 5) // 2)                   |
| /20             | /21                | 1,021 (= (2,048 - 5) // 2)                     |
| /25             | /26                | 29 (= (64 - 5) // 2)                           |
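The sizing rule (two IP addresses per node, five addresses reserved per subnet) is easy to compute in both directions; a minimal sketch:

```python
import ipaddress

def max_cluster_nodes(subnet_cidr: str) -> int:
    """Maximum Databricks nodes per subnet: each node consumes two IP
    addresses, and five addresses per subnet are reserved."""
    return (ipaddress.ip_network(subnet_cidr).num_addresses - 5) // 2

def smallest_subnet_prefix(nodes: int) -> int:
    """Smallest subnet (largest prefix length, within the allowed
    /17-/26 range) that still fits the given node count."""
    for prefix in range(26, 16, -1):  # /26 (smallest) down to /17 (largest)
        if (2 ** (32 - prefix) - 5) // 2 >= nodes:
            return prefix
    raise ValueError("node count exceeds a single /17 subnet")
```

For example, a 100-node cluster needs 205 addresses (2 per node plus 5 reserved), so the smallest workable subnet is a /24.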

Route table configuration

The route table(s) associated with these subnets should contain routes to:

  • S3 service: Install the S3 Gateway VPC endpoint in the VPC and specify the subnets during installation.
  • Internet access: Route to 0.0.0.0/0 with the NAT gateway (or network firewall) as the target.
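A VPC route table selects the most specific matching route (longest prefix match). A toy model with a hypothetical compute-subnet route table (the S3 gateway endpoint injects its own prefix-list route, omitted here):

```python
import ipaddress

# Hypothetical route table for a Databricks compute subnet.
ROUTES = {
    "10.0.0.0/16": "local",          # intra-VPC traffic stays local
    "0.0.0.0/0": "nat-gateway-id",   # everything else exits via the NAT gateway
}

def next_hop(dest_ip: str) -> str:
    """Longest-prefix match over the route table, as a VPC performs it."""
    addr = ipaddress.ip_address(dest_ip)
    matching = [ipaddress.ip_network(c) for c in ROUTES if addr in ipaddress.ip_network(c)]
    return ROUTES[str(max(matching, key=lambda n: n.prefixlen))]
```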

Interface type VPC endpoints

Install interface type (PrivateLink-based) VPC endpoints for private access to the AWS STS and Kinesis services in a separate, smaller subnet (one per availability zone). The security groups attached to these endpoints must allow ingress from the security groups attached to the Databricks clusters.

important

If private access to S3 services is strictly required, an interface type S3 VPC endpoint must also be installed. However, this comes at a high cost as interface type VPC endpoints charge for the amount of data that flows through them. Unless strictly required for compliance reasons, prefer the gateway S3 VPC endpoints, which are free of charge.

NAT Gateway configuration

Establish internet access by installing a NAT Gateway on a separate subnet. For high-availability internet access, deploy a NAT Gateway (and subnet) in each availability zone used by the subnets for the Databricks compute instances. The route table for these subnets must include an entry for 0.0.0.0/0, directing traffic to the Internet Gateway attached to the VPC.

Network firewall (optional)

If you require a network firewall for data exfiltration protection, install it in dedicated subnets (one per availability zone). Configure route tables to:

  • Route tables for NAT gateway subnets: Send internet-bound traffic (0.0.0.0/0) to the NAT gateway.
  • Route tables for Databricks compute subnets: Route traffic to the internet to the network firewall endpoints.
  • Route tables for NAT gateway subnet: Route traffic to the Databricks clusters through the network firewall endpoints.

Sharing network resources across multiple workspaces

You can share a single VPC across multiple workspaces. In this case, you can share the subnets for the NAT gateway, the network firewall, and the VPC endpoints. However, you need to create different subnets for the Databricks cluster deployment (a different set per workspace).
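When carving per-workspace cluster subnets out of a shared VPC, it is worth validating that the planned CIDRs are disjoint before provisioning; a sketch:

```python
import ipaddress

def assert_no_overlap(planned_subnets):
    """Raise if any two planned workspace subnets overlap."""
    nets = [ipaddress.ip_network(c) for c in planned_subnets]
    for i, a in enumerate(nets):
        for b in nets[i + 1:]:
            if a.overlaps(b):
                raise ValueError(f"{a} overlaps {b}")
    return True
```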

Hub-and-spoke architecture

You can use a VPC hub-and-spoke architecture, where all VPC endpoints, NAT gateways, firewalls, and so on, are installed in the hub VPC. Each workspace is associated with subnets (for launching Databricks clusters) in a different VPC, within the same or a different AWS account. The spoke VPCs are connected to the hub VPC using a Transit Gateway.

Network security controls by risk

Customers with data and workloads that have varying risk levels can combine workspace and network controls to establish clear operational boundaries while reusing shared governance and platform services. A workspace boundary is an effective way to separate domains and environments (such as Dev vs. Prod) and to apply workspace-scoped controls. Network controls then provide an independent enforcement layer that constrains where workloads can run and which destinations they can reach, including access to internal services and the public internet.

Tiered workspace model example

Higher-risk workloads are placed in more restrictive connectivity environments. Workspaces aligned to restricted risk classifications are deployed into tighter VPC/VNet configurations and are limited to approved egress destinations, such as private repositories and internal services. Lower-risk workloads can operate in less restrictive network environments, preserving developer velocity and broader package access.

This model allows administrators to tune controls at multiple layers:

  • Workspace-level controls: Define "who can do what" inside the workspace (access and execution guardrails).
  • Network-level controls: Define "where workloads can connect" (VPC/VNet restrictions and egress controls).

The design goal is not to "pick the right layer," but to apply complementary controls. Use workspaces to create clean administrative boundaries and reduce blast radius between environments. Use network segmentation and egress controls to enforce connectivity constraints that remain effective even when users have broad capabilities inside a workspace.

For serverless workspaces, the same principle applies, but the control surface shifts from customer-managed network constructs to platform controls such as serverless egress control policies.
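The layered model can be reduced to a toy policy check: an operation succeeds only when both the workspace guardrail and the network egress control permit it. All names below are hypothetical:

```python
# Hypothetical policy layers for a "restricted"-tier workspace.
WORKSPACE_ALLOWED_ACTIONS = {"run_job", "query_warehouse"}  # export/download disabled
NETWORK_ALLOWED_EGRESS = {"internal-pypi.corp.example", "unity-catalog-storage"}

def permitted(action: str, destination: str) -> bool:
    """Complementary enforcement: BOTH layers must allow the operation."""
    return action in WORKSPACE_ALLOWED_ACTIONS and destination in NETWORK_ALLOWED_EGRESS
```

Either layer alone can block an operation, which is why neither layer needs to be, or should be, the sole line of defense.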

Network architecture recommendations

Recommended

  • Deploy workspaces into customer-managed virtual networks for maximum control.
  • Size workspace subnets no smaller than /26; most use cases need /23 or larger (see sizing details above).
  • Align serverless NCC to customer-managed VNet/VPC settings.
  • Use IP access lists to restrict access to known IP ranges.
  • Use hub-and-spoke architecture for shared network resources across multiple workspaces.
  • Plan for high availability by deploying resources across multiple availability zones.

Evaluate based on requirements

  • For customers with strict network security policies:
    • Evaluate additional data exfiltration protections.
    • Consider using Private Link for highly sensitive workloads.
    • Configure network firewalls to control egress traffic.

Phase 4 outcomes

After completing Phase 4, you should have:

  • Network architecture designed for workspaces (customer-managed virtual private cloud).
  • Secure cluster connectivity (SCC) strategy defined.
  • IP access control strategy designed.
  • Data exfiltration protection evaluated (for sensitive workloads).
  • Private Link strategy defined (if required for compliance).
  • Serverless connectivity (NCC) designed for serverless workloads.
  • Cloud-specific network architecture designed (AWS/Azure/GCP).
  • Hub-and-spoke network architecture evaluated.
  • Network security controls aligned with risk levels.
  • Subnet sizing calculated based on expected cluster sizes.

Next phase: Phase 5: Design storage architecture

Implementation guidance: For step-by-step instructions to implement your network design, see Networking.