Isolated environment
The Isolated environment architecture inherits the Hardened connectivity baseline and adds two requirements: private workspace access and a required external firewall. Workspace access is gated by VPN or inbound PrivateLink, never the public internet. All classic compute egress flows through the firewall for inspection and policy enforcement.
This architecture has:
- Complete network isolation: All traffic flows through private connections.
- Private workspace access: VPN or inbound PrivateLink only. The workspace is unreachable from the public internet.
- Required egress inspection: Firewall inspection of all classic compute outbound traffic.
- Data exfiltration prevention: Network-layer controls block unauthorized data transfer.
Use this architecture when:
- Workspace access must be private, for example over VPN or inbound PrivateLink.
- You handle data in highly regulated industries, such as financial services, healthcare, or government.
- Compliance frameworks require egress controls (for example, SOC 2, HIPAA, PCI DSS, and FedRAMP).
- You are implementing an enterprise zero-trust security framework.
- Data exfiltration prevention is a requirement.
Prerequisites
- Databricks Enterprise plan.
- Existing VPN infrastructure or inbound PrivateLink connectivity.
- Firewall or network virtual appliance (NVA).
Architecture overview
The Isolated environment architecture routes all traffic through private connections with firewall inspection:
| Traffic type | Path |
|---|---|
| User access | Users → VPN or inbound PrivateLink → Workspace |
| Classic compute → control | Compute → Classic PrivateLink → Databricks control plane |
| Classic compute → cloud | Compute → VPC endpoints → AWS services (S3, STS, Kinesis) |
| Serverless → your resources | Serverless compute → NCC private endpoints → Your S3 buckets or VPC |
| Classic compute → egress | Compute → External firewall (required) → Inspected internet |
Required components
Inbound
The workspace is reachable only over private connections: VPN, inbound PrivateLink, or both, depending on your existing infrastructure. Customers typically choose one rather than stacking them.
Private access settings (disable public access)
This is the gating control that actually blocks public ingress. Without it, the workspace still accepts internet traffic even with PrivateLink configured. PrivateLink becomes an additional path, not the only path.
Create a private access settings object with Public access enabled set to False and attach it to the workspace. When public access is disabled, no public traffic can reach the workspace.
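As a minimal sketch, the request body for such an object could be assembled as follows. The field names (`private_access_settings_name`, `region`, `public_access_enabled`) are assumed from the Databricks Account API for private access settings and should be checked against the current API reference. The object name and region are placeholders, and the helper function is hypothetical.

```python
import json


def build_private_access_settings(name: str, region: str) -> dict:
    """Build a request body that disables public access to the workspace.

    Field names are assumptions based on the Account API; verify them
    before sending this body to a real endpoint.
    """
    return {
        "private_access_settings_name": name,
        "region": region,
        # The gating control: False blocks public ingress.
        "public_access_enabled": False,
    }


# Placeholder name and region for illustration only.
payload = build_private_access_settings("isolated-pas", "us-east-1")
print(json.dumps(payload, indent=2))
```

The sketch only builds the body. Sending it, and attaching the resulting object to the workspace, is left to whichever client or Infrastructure-as-Code tooling you already use.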
Workspace ingress controls
Configure workspace ingress through context-based ingress (CBI), the recommended ingress policy framework. CBI rules combine network source (IP ranges), identity, authentication mechanism, and access scope into a single allow/deny model, so the network-source attribute does the same job as the standalone IP access list feature, plus more.
IP access lists remain supported and can be configured alongside CBI. When both are configured, a request must be allowed by both controls.
Configuration levels:
- Account-level CBI policies: Apply to all workspaces in the account. See Manage context-based ingress policies.
- Workspace-level IP access lists: Apply to a single workspace. See Configure IP access lists for workspaces.
- Account-level IP access lists: Apply to the account console. See Configure IP access lists for the account console.
Best practices:
- Start broad, refine based on actual usage.
- Document IP ranges with purpose and expiration dates.
- Maintain administrator access through a known-good IP range.
- Review quarterly and remove obsolete ranges.
Ingress policies and IP access lists can lock you out of your workspace if misconfigured. Always maintain administrator access through a known-good IP range.
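One way to reduce the lockout risk is a pre-flight check that runs before any rule change is applied. The sketch below models only the network-source part of the decision and assumes both a CBI policy and an IP access list are configured, so a request must pass both. The function names and the address ranges are hypothetical.

```python
import ipaddress


def ip_allowed(ip: str, cidr_ranges: list[str]) -> bool:
    """Return True if the address falls inside at least one CIDR range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


def admin_keeps_access(admin_ip: str,
                       cbi_ranges: list[str],
                       ip_access_list: list[str]) -> bool:
    """Model the rule that a request must be allowed by both controls.

    Only the network source is evaluated here; real CBI rules also
    consider identity, authentication mechanism, and access scope.
    """
    return ip_allowed(admin_ip, cbi_ranges) and ip_allowed(admin_ip, ip_access_list)


# Proposed configuration (documentation address ranges, not real ones).
proposed_cbi = ["203.0.113.0/24", "198.51.100.0/24"]
proposed_acl = ["203.0.113.0/28"]

safe = admin_keeps_access("203.0.113.10", proposed_cbi, proposed_acl)
locked_out = not admin_keeps_access("203.0.113.40", proposed_cbi, proposed_acl)
```

In this example the second address is inside the CBI range but outside the narrower IP access list, so an administrator connecting from it would lose access once both controls are in force.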
Delta Sharing recipient access control
Delta Sharing uses its own IP access lists, configured on recipient objects. These lists are separate from workspace IP access lists and are not covered by context-based ingress. They apply only to open sharing (non-Databricks recipients).
See Restrict Delta Sharing recipient access using IP access lists (open sharing).
Inbound connectivity
Establish private connectivity for user access to the workspace UI and API. Users reach the workspace over VPN or inbound PrivateLink, never the public internet.
Custom DNS
Configure private DNS to resolve Databricks endpoints to private IP addresses.
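A quick way to confirm the private DNS setup is to resolve the workspace hostname from inside the network and check that every returned address is in a private range. The following is a minimal sketch that treats "private" as the RFC 1918 IPv4 ranges, which is an assumption about how your VPC is addressed. The function names are hypothetical.

```python
import ipaddress
import socket

# RFC 1918 private IPv4 ranges.
RFC1918_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def resolve_ipv4(hostname: str) -> set[str]:
    """Resolve a hostname to its IPv4 addresses using the local resolver."""
    infos = socket.getaddrinfo(hostname, 443, family=socket.AF_INET,
                               proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def all_private(addresses) -> bool:
    """Return True only if there is at least one address and all are RFC 1918."""
    parsed = [ipaddress.ip_address(a) for a in addresses]
    return bool(parsed) and all(
        any(addr in net for net in RFC1918_NETWORKS) for addr in parsed
    )
```

Run from a host on the VPN or inside the VPC, `all_private(resolve_ipv4(workspace_hostname))` should return `True`, where `workspace_hostname` is your own workspace URL host. A `False` result suggests the resolver is still returning public addresses.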
Outbound
Serverless egress controls (network policies and NCC private endpoints) are inherited from the Hardened connectivity baseline. This architecture makes the external firewall, optional in Hardened, required for full classic compute egress inspection.
External firewall (required)
Route all egress traffic through a firewall for inspection, logging, and policy enforcement. Options include:
- AWS Network Firewall, a managed service with lower operational overhead and integration with AWS routing.
- Third-party firewall appliance (such as Palo Alto) integrated with Gateway Load Balancer, for additional inspection capabilities. Organizations with an existing Palo Alto investment often prefer this option.
See IP addresses and domains for Databricks services and assets for the required Databricks endpoints that firewall rules must allow.
For maximum lockdown, consider hosting a private package repository (such as JFrog Artifactory or Sonatype Nexus) for Python, R, and Maven packages. This eliminates the need for firewall rules allowing access to public package indexes like PyPI.
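For Python packages, pointing pip at the private repository usually comes down to setting `index-url` in the pip configuration file. The sketch below renders such a file. The repository URL is a placeholder, the helper name is hypothetical, and how the file is distributed to compute resources is outside the scope of this example.

```python
import configparser
import io


def render_pip_conf(index_url: str) -> str:
    """Render a pip configuration that uses a single private index."""
    # Interpolation is disabled so a '%' in the URL is written literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser["global"] = {"index-url": index_url}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Placeholder URL; substitute the address of your own repository.
pip_conf = render_pip_conf(
    "https://artifacts.example.internal/api/pypi/pypi-remote/simple"
)
print(pip_conf)
```

With only the private index configured, pip no longer needs a route to the public index, so the corresponding firewall allow rule can be dropped.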
Databricks control plane and SCC relay connections use TLS with certificate pinning. Do not enable TLS inspection (decrypt and re-encrypt) on traffic between your clusters and the Databricks control plane. Doing so causes cluster failures. Configure firewall rules to allow these connections by destination FQDN or IP without TLS interception. See IP addresses and domains for Databricks services and assets for required endpoints.
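The bypass rule can be thought of as a per-destination decision made before any decryption is attempted. The following sketch illustrates that decision with wildcard matching on the destination FQDN. The patterns shown are placeholders, not real Databricks domains: populate the list from the required endpoints for your region. The function name is hypothetical, and real firewalls express this in their own rule language.

```python
from fnmatch import fnmatchcase

# Placeholder patterns; replace with the control plane and SCC relay
# FQDNs listed for your region.
TLS_BYPASS_PATTERNS = [
    "*.control-plane.example",
    "relay.example",
]


def tls_action(fqdn: str, bypass_patterns: list[str]) -> str:
    """Return 'bypass' for destinations that must not be decrypted,
    and 'inspect' for everything else."""
    name = fqdn.lower().rstrip(".")
    for pattern in bypass_patterns:
        if fnmatchcase(name, pattern.lower()):
            return "bypass"
    return "inspect"


control_plane_action = tls_action("Workspace.control-plane.example", TLS_BYPASS_PATTERNS)
other_action = tls_action("pypi.org", TLS_BYPASS_PATTERNS)
```

Here the first destination matches a bypass pattern and is passed through without interception, while the second remains subject to inspection.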
Incorrectly configured firewall rules can break Databricks capabilities. Test thoroughly in a non-production environment.
Data exfiltration protection
Configure network policies and firewall controls to prevent unauthorized data exfiltration:
- Serverless egress control through network policies.
- Classic compute egress through firewall/NVA.
- Private endpoint rules for approved data destinations.
See Data exfiltration protection for implementation guidance.
Classic compute baseline
The classic compute baseline is inherited from Managed security, and cloud service endpoints are inherited from Hardened connectivity. No additional classic compute components are required for this architecture.
The baseline includes customer-managed VPC, Secure Cluster Connectivity (SCC), and classic PrivateLink. Cloud service endpoints include VPC endpoints, VPC endpoint policies, and S3 bucket policies.
Egress approaches for data access
There are two approaches to handling outbound data access from compute resources:
- NAT gateway with firewall: Deploy a NAT gateway for outbound connectivity and route traffic through a firewall for inspection. This approach allows controlled access to external package repositories and APIs and maintains visibility into traffic patterns. Use this approach when you must access external resources but require inspection and logging.
- No NAT gateway (fully private): Remove the NAT gateway entirely to eliminate all public communication from compute resources. All data access occurs through private endpoints and VPC endpoints only. This approach has the highest level of security by removing the possibility of data exfiltration through public egress paths. Use this approach when your organization prohibits any public internet communication from compute resources.
Implementation
Start from a deployed Hardened connectivity baseline. The following phases add the private workspace access and required external firewall that define this architecture.
Phase 1: Inbound controls
- Configure inbound PrivateLink so user access to the Databricks web application and REST APIs flows through PrivateLink instead of public IPs. See Configure Inbound PrivateLink.
- Create a private access settings object with Public access enabled set to False and attach it to the workspace. This is what actually blocks public ingress. Without it, the workspace still accepts internet traffic even with inbound PrivateLink configured.
- Test user access through the corporate VPN or PrivateLink path to verify that traffic to the workspace is routed over the private network as intended, and that public access is blocked.
Phase 2: External firewall (required)
- Deploy a third-party firewall appliance (such as Palo Alto) in a hub VPC and integrate it with the workspace VPC using appropriate routing (for example, Transit Gateway or VPC peering), or use AWS Network Firewall.
- Configure route tables to send 0.0.0.0/0 to the firewall. Firewall rules must allow required Databricks endpoints (see IP addresses and domains for Databricks services and assets), cloud service endpoints, and approved external services.
- Configure firewall rules without TLS interception on control plane and SCC relay traffic.
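A simple check on the routing step is to confirm that the default route in each compute subnet's route table points at an approved firewall target rather than directly at a NAT or internet gateway. The sketch below works on plain dictionaries shaped like the route tables returned by the EC2 DescribeRouteTables API. That shape, the list of target keys, and the example identifiers are assumptions, and the function names are hypothetical.

```python
from typing import Optional

# Route fields that can carry the next-hop target (assumed from the
# EC2 DescribeRouteTables response shape).
TARGET_KEYS = (
    "GatewayId",
    "NatGatewayId",
    "TransitGatewayId",
    "NetworkInterfaceId",
    "VpcPeeringConnectionId",
)


def default_route_target(route_table: dict) -> Optional[str]:
    """Return the next-hop ID of the 0.0.0.0/0 route, or None if absent."""
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        for key in TARGET_KEYS:
            if route.get(key):
                return route[key]
    return None


def egress_via_firewall(route_table: dict, approved_targets: set[str]) -> bool:
    """True if the default route points at an approved firewall target."""
    return default_route_target(route_table) in approved_targets


# Example data with made-up identifiers.
compute_rt = {"Routes": [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "vpce-0abc"},
]}
bypass_rt = {"Routes": [
    {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0123"},
]}
```

With `approved_targets={"vpce-0abc"}`, the first table passes and the second fails, because its default route sends traffic straight to a NAT gateway without inspection.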
Phase 3: Validation
- Verify egress control by reviewing firewall logs to confirm that cluster and serverless traffic reaches only approved external or internal endpoints.
- Confirm there are no public IP addresses on cluster nodes or other Databricks-managed compute resources.
- Validate that all control, data, and incoming traffic flows through the configured PrivateLink endpoints and firewall paths.
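The public IP check in this phase can be scripted against an inventory of cluster instances. The following sketch scans dictionaries shaped like the `Reservations` list returned by the EC2 DescribeInstances API and reports any instance that carries a public address. The response shape is an assumption, the instance IDs are made up, and the function name is hypothetical.

```python
def instances_with_public_ip(reservations: list[dict]) -> list[str]:
    """Return the IDs of instances that have a public IP address."""
    offenders = []
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            # The key is absent or empty when no public address is assigned.
            if instance.get("PublicIpAddress"):
                offenders.append(instance["InstanceId"])
    return offenders


# Example inventory with made-up identifiers.
sample_reservations = [
    {"Instances": [
        {"InstanceId": "i-0aaa", "PrivateIpAddress": "10.0.1.10"},
        {"InstanceId": "i-0bbb", "PrivateIpAddress": "10.0.1.11",
         "PublicIpAddress": "198.51.100.20"},
    ]},
    {"Instances": [
        {"InstanceId": "i-0ccc", "PrivateIpAddress": "10.0.2.10"},
    ]},
]

offenders = instances_with_public_ip(sample_reservations)
```

An empty result is the expected outcome for this architecture. Any ID in the list points to a node that was launched with a public address and should be investigated.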
The Databricks Terraform SRA provides Infrastructure-as-Code templates that automate this deployment pattern.
Validation
After you deploy the architecture, run the following checks to confirm that full network isolation, private connectivity, and egress controls work as configured.
| Check | Expected result |
|---|---|
| Workspace accessible via VPN | Yes |
| Workspace accessible without VPN | No |
| Clusters launch with SCC | Yes, no public IPs |
| Data access through private connections | Yes |
| Egress blocked without firewall approval | Yes |
| DNS resolves to private IPs | Yes |
Troubleshooting
If a validation check fails or a workload can't connect to a required endpoint, use the cloud-specific table that follows to diagnose common issues.
| Issue | Cause | Resolution |
|---|---|---|
| Cluster fails to start | Firewall blocking required endpoints or misconfigured VPC endpoints for SCC, Databricks control plane, S3, Kinesis, or STS (security groups, routing) | Review firewall logs and add Databricks infrastructure rules; verify VPC endpoint security groups allow traffic from cluster subnets; check route tables |
| DNS resolution fails | Private DNS misconfigured | Verify Route 53 private hosted zones and VPC associations |
| S3 access fails | VPC endpoint or routing issue | Check S3 gateway endpoint configuration and route tables |
| Package installation fails | PyPI blocked by firewall | Add PyPI to firewall allow list |
Ongoing maintenance
- Firewall rules: Review and update egress allow-lists regularly.
- DNS management: Update records when you add workspaces.
- Endpoint monitoring: Track private endpoint health and data transfer costs.
- Network policies: Add private endpoints for new approved data sources.
- Remove firewall: If firewall operational overhead is too high or compliance requirements relax, you can remove the firewall component and keep private connectivity and VPN access.
- Downgrade to Hardened connectivity: If private workspace access becomes a productivity barrier, you can move back to the Hardened connectivity architecture.
Next steps
- Data exfiltration protection: Detailed reference architecture for combining network and Unity Catalog controls to prevent data exfiltration.
- Networking: Networking options and concepts for Databricks.