
Isolated environment

The Isolated environment architecture inherits the Hardened connectivity baseline and adds two requirements: private workspace access and a required external firewall. Workspace access is gated by VPN or inbound PrivateLink, never the public internet. All classic compute egress flows through the firewall for inspection and policy enforcement.

This architecture has:

  • Complete network isolation: All traffic flows through private connections.
  • Private workspace access: VPN or inbound PrivateLink only. The workspace is unreachable from the public internet.
  • Required egress inspection: Firewall inspection of all classic compute outbound traffic.
  • Data exfiltration prevention: Network-layer controls block unauthorized data transfer.

Use this architecture when:

  • Workspace access must be private, such as over VPN or inbound PrivateLink.
  • You handle data in highly regulated industries (for example, financial services, healthcare, or government).
  • Compliance frameworks require egress controls (for example, SOC 2, HIPAA, PCI DSS, and FedRAMP).
  • You are implementing an enterprise zero-trust security framework.
  • Data exfiltration prevention is a requirement.

Prerequisites

  • Existing VPN infrastructure or inbound PrivateLink connectivity.
  • Firewall or network virtual appliance (NVA).

Architecture overview

The Isolated environment architecture routes all traffic through private connections with firewall inspection:

| Traffic type | Path |
| --- | --- |
| User access | Users → VPN or inbound PrivateLink → Workspace |
| Classic compute → control | Compute → Classic PrivateLink → Databricks control plane |
| Classic compute → cloud | Compute → VPC endpoints → AWS services (S3, STS, Kinesis) |
| Serverless → your resources | Serverless compute → NCC private endpoints → Your S3 buckets or VPC |
| Classic compute → egress | Compute → External firewall (required) → Inspected internet |

Required components

Inbound

The workspace is reachable only over private connections: VPN, inbound PrivateLink, or both, depending on your existing infrastructure. Customers typically select one rather than stacking them.

Private access settings (disable public access)

This is the gating control that actually blocks public ingress. Without it, the workspace still accepts internet traffic even with PrivateLink configured, because PrivateLink becomes an additional path rather than the only path.

Create a private access settings object with Public access enabled set to False and attach it to the workspace. When public access is disabled, no public traffic can reach the workspace.
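As a rough illustration of the setting described above, the following sketch builds a request body for a private access settings object with public access disabled. The field names are assumed to match the Databricks Account API for private access settings, and the object name and region are hypothetical placeholders; check the current API reference before using them.

```python
import json


def build_private_access_settings(name: str, region: str) -> dict:
    """Build a request body with public access disabled (field names are assumptions)."""
    return {
        "private_access_settings_name": name,
        "region": region,
        # The gating control: False means no public traffic can reach the workspace.
        "public_access_enabled": False,
    }


# Hypothetical name and region, for illustration only.
payload = build_private_access_settings("isolated-pas", "us-east-1")
print(json.dumps(payload, indent=2))
```

The sketch only constructs the body. Sending it, and attaching the resulting object to the workspace, is left to whichever tool you already use (account console, REST client, or Terraform).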

Workspace ingress controls

Configure workspace ingress through context-based ingress (CBI), the recommended ingress policy framework. CBI rules combine network source (IP ranges), identity, authentication mechanism, and access scope into a single allow/deny model, so the network-source attribute does the same job as the standalone IP access list feature, plus more.

IP access lists remain supported and can be configured alongside CBI. When both are configured, a request must be allowed by both controls.
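The "allowed by both controls" rule can be sketched as a logical AND over the two checks. This is a simplified model that looks only at the network-source attribute; real CBI rules also evaluate identity, authentication mechanism, and access scope. The function names and CIDR ranges are hypothetical.

```python
from ipaddress import ip_address, ip_network


def in_any_range(source_ip: str, cidrs) -> bool:
    """Return True if source_ip falls inside at least one CIDR range."""
    addr = ip_address(source_ip)
    return any(addr in ip_network(cidr) for cidr in cidrs)


def request_allowed(source_ip: str, cbi_cidrs, ip_access_list_cidrs) -> bool:
    """A request passes only when both controls allow its source address."""
    return in_any_range(source_ip, cbi_cidrs) and in_any_range(
        source_ip, ip_access_list_cidrs
    )


# Hypothetical ranges: CBI allows the corporate network, the IP access list a subset.
cbi = ["10.0.0.0/8"]
ip_list = ["10.20.0.0/16"]
print(request_allowed("10.20.5.4", cbi, ip_list))  # True
print(request_allowed("10.30.1.1", cbi, ip_list))  # False
```

The second address is inside the CBI range but outside the IP access list, so it is rejected. Keeping the two controls aligned avoids this kind of surprise.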

Configuration levels:

Best practices:

  • Start broad, refine based on actual usage.
  • Document IP ranges with purpose and expiration dates.
  • Maintain administrator access through a known-good IP range.
  • Review quarterly and remove obsolete ranges.

Warning: Ingress policies and IP access lists can lock you out of your workspace if misconfigured. Always maintain administrator access through a known-good IP range.
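One way to support the "document with purpose and expiration dates" and "review quarterly" practices is to keep the ranges in a small inventory and flag the expired ones. This is a minimal sketch; the record type, field names, and sample ranges are hypothetical and not part of any Databricks API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DocumentedRange:
    cidr: str
    purpose: str
    expires: date


def expired_ranges(ranges, today: date):
    """Return the ranges whose expiration date is before today."""
    return [r for r in ranges if r.expires < today]


# Hypothetical inventory for illustration.
inventory = [
    DocumentedRange("10.20.0.0/16", "Corporate VPN (admin access)", date(2030, 1, 1)),
    DocumentedRange("10.99.0.0/24", "Contractor pilot", date(2024, 6, 30)),
]

for r in expired_ranges(inventory, today=date(2025, 1, 1)):
    print(f"Review for removal: {r.cidr} ({r.purpose})")
```

Before removing anything the review flags, confirm that the known-good administrator range is still in place.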

Delta Sharing recipient access control

Delta Sharing uses its own IP access lists, configured on recipient objects. These lists are separate from workspace IP access lists and are not covered by context-based ingress. They apply only to open sharing (non-Databricks recipients).

See Restrict Delta Sharing recipient access using IP access lists (open sharing).

Inbound connectivity

Establishes private connectivity for user access to the workspace UI and API. Users reach the workspace over VPN or inbound PrivateLink, never the public internet.

See Configure Inbound PrivateLink.

Custom DNS

Configure private DNS to resolve Databricks endpoints to private IP addresses.

See Configure DNS for AWS inbound Private Link.
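A quick way to check the private DNS setup is to resolve the workspace hostname from inside the network and confirm that every returned address is private. The sketch below assumes the VPC uses RFC 1918 address space; the helper names are hypothetical, and the hostname in the usage comment is a placeholder.

```python
import socket
from ipaddress import ip_address, ip_network

# RFC 1918 private ranges (assumes your VPC CIDR comes from this space).
RFC1918 = [ip_network(c) for c in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_rfc1918(address: str) -> bool:
    """Return True if the address is inside one of the RFC 1918 ranges."""
    addr = ip_address(address)
    return any(addr in net for net in RFC1918)


def all_private(addresses) -> bool:
    """True only if at least one address was returned and all are private."""
    addresses = list(addresses)
    return bool(addresses) and all(is_rfc1918(a) for a in addresses)


def resolve_ipv4(hostname: str) -> list:
    """Resolve a hostname to its IPv4 addresses using the system resolver."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, proto=socket.IPPROTO_TCP
    )
    return sorted({info[4][0] for info in infos})


print(all_private(["10.1.2.3", "10.1.3.4"]))  # True
print(all_private(["10.1.2.3", "8.8.8.8"]))   # False
# Usage from a host inside the VPC (placeholder hostname):
# print(all_private(resolve_ipv4("<your-workspace-hostname>")))
```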

Outbound

Serverless egress controls (network policies and NCC private endpoints) are inherited from the Hardened connectivity baseline. This architecture makes the external firewall, optional in Hardened, required for full classic compute egress inspection.

External firewall (required)

Route all egress traffic through a firewall for inspection, logging, and policy enforcement. Options include:

  • AWS Network Firewall, a managed service with lower operational overhead and integration with AWS routing.
  • Third-party firewall appliance (such as Palo Alto) integrated with Gateway Load Balancer for more advanced inspection capabilities. Organizations with an existing Palo Alto investment often prefer this option.

See IP addresses and domains for Databricks services and assets for the required Databricks endpoints that firewall rules must allow.

Tip: For maximum lockdown, consider hosting a private package repository (such as JFrog Artifactory or Sonatype Nexus) for Python, R, and Maven packages. This eliminates the need for firewall rules allowing access to public package indexes like PyPI.

Warning: Databricks control plane and SCC relay connections use TLS with certificate pinning. Do not enable TLS inspection (decrypt and re-encrypt) on traffic between your clusters and the Databricks control plane. Doing so causes cluster failures. Configure firewall rules to allow these connections by destination FQDN or IP without TLS interception. See IP addresses and domains for Databricks services and assets for required endpoints.

Important: Incorrectly configured firewall rules can break Databricks capabilities. Test thoroughly in a non-production environment.
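To make the firewall guidance concrete, the following sketch models an egress allow-list in which each rule states whether TLS inspection applies. It is a toy policy model, not the rule syntax of AWS Network Firewall or any appliance. The domain patterns are placeholders; the real Databricks endpoints come from IP addresses and domains for Databricks services and assets.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class EgressRule:
    fqdn_pattern: str     # shell-style pattern, for example "*.example.com"
    tls_inspection: bool  # False = pass through without decrypting


def decide(fqdn: str, rules) -> str:
    """Return the action for a destination: first matching rule wins, else deny."""
    name = fqdn.lower().rstrip(".")
    for rule in rules:
        if fnmatchcase(name, rule.fqdn_pattern):
            return "allow-inspect" if rule.tls_inspection else "allow-passthrough"
    return "deny"


# Placeholder patterns for illustration only.
RULES = [
    # Control plane and SCC relay: allow by FQDN, never intercept TLS.
    EgressRule("*.control-plane.example.com", tls_inspection=False),
    # Approved external service that may be inspected.
    EgressRule("pypi.org", tls_inspection=True),
]

print(decide("relay.control-plane.example.com", RULES))  # allow-passthrough
print(decide("pypi.org", RULES))                         # allow-inspect
print(decide("unknown.example.net", RULES))              # deny
```

The default-deny fallthrough mirrors the intent of this architecture: anything not explicitly approved is blocked.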

Data exfiltration protection

Configure network policies and firewall controls to prevent unauthorized data exfiltration:

  • Serverless egress control through network policies.
  • Classic compute egress through firewall/NVA.
  • Private endpoint rules for approved data destinations.

See Data exfiltration protection for implementation guidance.

Classic compute baseline

The classic compute baseline is inherited from Managed security, and cloud service endpoints are inherited from Hardened connectivity. No additional classic compute components are required for this architecture.

The baseline includes customer-managed VPC, Secure Cluster Connectivity (SCC), and classic PrivateLink. Cloud service endpoints include VPC endpoints, VPC endpoint policies, and S3 bucket policies.

Egress approaches for data access

There are two approaches to handling outbound data access from compute resources:

  • NAT gateway with firewall: Deploy a NAT gateway for outbound connectivity and route traffic through a firewall for inspection. This approach allows controlled access to external package repositories and APIs and maintains visibility into traffic patterns. Use this approach when you must access external resources but require inspection and logging.

  • No NAT gateway (fully private): Remove the NAT gateway entirely to eliminate all public communication from compute resources. All data access occurs through private endpoints and VPC endpoints only. This approach provides the highest level of security because it removes the possibility of data exfiltration through public egress paths. Use this approach when your organization prohibits any public internet communication from compute resources.

Implementation

Start from a deployed Hardened connectivity baseline. The following phases add the private workspace access and required external firewall that define this architecture.

Phase 1: Inbound controls

  1. Configure inbound PrivateLink so user access to the Databricks web application and REST APIs flows through PrivateLink instead of public IPs. See Configure Inbound PrivateLink.
  2. Create a private access settings object with Public access enabled set to False and attach it to the workspace. This is what actually blocks public ingress. Without it, the workspace still accepts internet traffic even with inbound PrivateLink configured.
  3. Test user access through the corporate VPN or PrivateLink path to verify that traffic to the workspace is routed over the private network as intended, and that public access is blocked.

Phase 2: External firewall (required)

  1. Deploy a third-party firewall appliance (such as Palo Alto) in a hub VPC and integrate it with the workspace VPC using appropriate routing (for example, Transit Gateway or VPC peering), or use AWS Network Firewall.
  2. Configure route tables to send 0.0.0.0/0 to the firewall. Firewall rules must allow required Databricks endpoints (see IP addresses and domains for Databricks services and assets), cloud service endpoints, and approved external services.
  3. Configure firewall rules without TLS interception on control plane and SCC relay traffic.
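The routing step in this phase can be spot-checked by confirming that each compute subnet's default route points at an approved firewall target. The sketch below works on plain dictionaries assumed to resemble the route table entries returned by the EC2 DescribeRouteTables API; it makes no AWS calls, and all resource IDs are hypothetical.

```python
# Keys under which a route's target may appear (assumed DescribeRouteTables shape).
TARGET_KEYS = (
    "GatewayId",
    "NatGatewayId",
    "VpcEndpointId",
    "TransitGatewayId",
    "NetworkInterfaceId",
)


def default_route_target(route_table: dict):
    """Return the target ID of the 0.0.0.0/0 route, or None if there is none."""
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") == "0.0.0.0/0":
            for key in TARGET_KEYS:
                if route.get(key):
                    return route[key]
    return None


def default_route_is_approved(route_table: dict, approved_targets) -> bool:
    """True if the default route exists and points at an approved firewall target."""
    target = default_route_target(route_table)
    return target is not None and target in approved_targets


# Hypothetical IDs for illustration.
approved = {"vpce-0123456789abcdef0"}
good = {"Routes": [{"DestinationCidrBlock": "0.0.0.0/0", "VpcEndpointId": "vpce-0123456789abcdef0"}]}
bad = {"Routes": [{"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0aaaabbbbccccdddd"}]}

print(default_route_is_approved(good, approved))  # True
print(default_route_is_approved(bad, approved))   # False
```

Listing approved target IDs explicitly, rather than matching on ID prefixes, keeps the check valid whether the firewall is reached through an endpoint, a Transit Gateway, or an appliance interface.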

Phase 3: Validation

  1. Verify egress control by reviewing firewall logs to confirm that cluster and serverless traffic reaches only approved external or internal endpoints.
  2. Confirm there are no public IP addresses on cluster nodes or other Databricks-managed compute resources.
  3. Validate that all control, data, and incoming traffic flows through the configured PrivateLink endpoints and firewall paths.
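The public IP check in this phase can be automated by scanning instance descriptions for a public address. The sketch below assumes input shaped like the "Reservations" list in an EC2 DescribeInstances response; it does not call AWS, and the instance IDs and addresses are made up.

```python
def instances_with_public_ip(reservations) -> list:
    """Return the IDs of instances that report a public IP address."""
    offenders = []
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            if instance.get("PublicIpAddress"):
                offenders.append(instance["InstanceId"])
    return offenders


# Hypothetical sample data for illustration.
sample = [
    {
        "Instances": [
            {"InstanceId": "i-0aaa111122223333a", "PrivateIpAddress": "10.1.2.3"},
            {
                "InstanceId": "i-0bbb444455556666b",
                "PrivateIpAddress": "10.1.2.4",
                "PublicIpAddress": "8.8.8.8",
            },
        ]
    }
]

print(instances_with_public_ip(sample))  # ['i-0bbb444455556666b']
```

An empty result is the expected outcome for cluster nodes in this architecture.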

The Databricks Terraform SRA provides Infrastructure-as-Code templates that automate this deployment pattern.

Validation

After you deploy the architecture, run the following checks to confirm that full network isolation, private connectivity, and egress controls work as configured.

| Check | Expected result |
| --- | --- |
| Workspace accessible via VPN | Yes |
| Workspace accessible without VPN | No |
| Clusters launch with SCC | Yes, no public IPs |
| Data access through private connections | Yes |
| Egress blocked without firewall approval | Yes |
| DNS resolves to private IPs | Yes |
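The checks above can be tracked as a simple expected-versus-observed comparison. This is a minimal sketch: the check names are copied from the table, the observations are entered by hand or by your own probes, and the function name is hypothetical.

```python
# Expected results from the validation table (True = Yes, False = No).
EXPECTED = {
    "Workspace accessible via VPN": True,
    "Workspace accessible without VPN": False,
    "Clusters launch with SCC": True,
    "Data access through private connections": True,
    "Egress blocked without firewall approval": True,
    "DNS resolves to private IPs": True,
}


def failed_checks(observed: dict) -> list:
    """Return checks whose observed result differs from, or is missing against, EXPECTED."""
    return [
        check
        for check, expected in EXPECTED.items()
        if observed.get(check) != expected
    ]


# Example observation: everything matches except public reachability.
observed = dict(EXPECTED)
observed["Workspace accessible without VPN"] = True

print(failed_checks(observed))  # ['Workspace accessible without VPN']
```

A check with no recorded observation is reported as failed, so an incomplete validation run cannot pass silently.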

Troubleshooting

If a validation check fails or a workload can't connect to a required endpoint, use the cloud-specific table that follows to diagnose common issues.

| Issue | Cause | Resolution |
| --- | --- | --- |
| Cluster fails to start | Firewall blocking required endpoints or misconfigured VPC endpoints for SCC, Databricks control plane, S3, Kinesis, or STS (security groups, routing) | Review firewall logs and add Databricks infrastructure rules; verify VPC endpoint security groups allow traffic from cluster subnets; check route tables |
| DNS resolution fails | Private DNS misconfigured | Verify Route 53 private hosted zones and VPC associations |
| S3 access fails | VPC endpoint or routing issue | Check S3 gateway endpoint configuration and route tables |
| Package installation fails | PyPI blocked by firewall | Add PyPI to firewall allow list |
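When diagnosing the issues above, a basic TCP reachability probe run from a host in the cluster subnet can help distinguish a blocked endpoint from an application-level failure. The sketch below uses only the Python standard library. The function name is hypothetical, the hostname in the usage comment is a placeholder, and a successful TCP connection does not prove that TLS or authentication will succeed.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Usage from a host in the cluster subnet (placeholder hostname):
# print(can_connect("<required-endpoint-hostname>", 443))
```

If the probe fails for a required endpoint, compare the timestamp with the firewall logs to see whether the connection was dropped there or never left the subnet.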

Ongoing maintenance

  • Firewall rules: Review and update egress allow-lists regularly.
  • DNS management: Update records when you add workspaces.
  • Endpoint monitoring: Track private endpoint health and data transfer costs.
  • Network policies: Add private endpoints for new approved data sources.
  • Remove firewall: If firewall operational overhead is too high or compliance requirements relax, you can remove the firewall component and keep private connectivity and VPN access.
  • Downgrade to Hardened connectivity: Consider this option if private workspace access becomes a productivity barrier.

Next steps

  • Data exfiltration protection: Detailed reference architecture for combining network and Unity Catalog controls to prevent data exfiltration.
  • Networking: Networking options and concepts for Databricks.