Limit network egress for your workspace using a firewall
This page shows you how to configure VPC firewall rules, routes, and Private DNS to restrict network egress from your Databricks workspace on Google Cloud to only essential services and destinations.
Before you begin, you'll need:
- Familiarity with Databricks architecture. See High-level architecture.
- If you plan to use gcloud CLI commands, install the Google Cloud SDK.
- You must have the roles/iam.networkAdmin role, or sufficient permissions.
Firewall configuration overview
This configuration uses a deny-by-default approach: block all egress traffic, then selectively allow only required connections. The configuration has three main components:
- Private Google Access: Enable VMs to reach Google Cloud Storage and other Google services over Google's internal network. This allows Databricks to write workspace logs to GCS without requiring public IPs on cluster nodes.
- VPC routes and DNS: Route Google API traffic through restricted.googleapis.com, which provides access only to VPC Service Controls-compliant services. This prevents data exfiltration to unsupported Google services.
- Firewall rules for Databricks services: Allow egress to specific Databricks control plane endpoints:
  - Databricks control plane and Databricks managed system resources
  - Intra-subnet communication for cluster functionality
  - Default Hive metastore (port 3306), unless you use an External Apache Hive metastore (legacy)
Control plane service IP addresses vary by region. Keep IP addresses and domains for Databricks services and assets available when configuring firewall rules in Step 2. You'll need the regional IP addresses and ports for your workspace.
Step 1: Plan your network sizing
Before creating your workspace, plan your network size to ensure sufficient IP space for your workloads. Changing the subnet after workspace creation requires updating the workspace's network configuration. See Update workspace network configuration.
If the subnet is too small, the workspace exhausts its IP space and causes jobs to fail. For guidance on sizing your subnet based on cluster size and workspace count, see Subnet sizing for a new workspace.
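If the workspace already exists, you can check how much room its subnet provides before scaling up workloads. The following is a quick sketch; <subnet-name> and <region> are placeholders for your workspace's values.
# Show the primary IP range of an existing subnet
gcloud compute networks subnets describe <subnet-name> \
--region <region> \
--format "value(ipCidrRange)"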
Step 2: Add VPC firewall rules
After you create your workspace, configure the firewall rules to implement the deny-by-default security model. Replace the placeholders in the commands below with your workspace-specific values:
- <vpc-name>: Your Databricks-managed VPC (format: databricks-managed-<workspace-ID>)
- <workspace-id>: Your workspace ID from the workspace URL
- Regional IP addresses from IP addresses and domains for Databricks services and assets
Get your VPC name
- From the account console workspace page, click your workspace to launch it.
- Copy the numeric part of the URL after ?o=. For example, if the workspace URL is https://1676665108650415.5.gcp.databricks.com/?o=1676665108650415#, the workspace ID is 1676665108650415.
- Your VPC name is databricks-managed-<workspace-ID>. For example, databricks-managed-1676665108650415.
Alternatively, use the Google Cloud Console VPCs page to view and manage firewall rules through the UI.
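As a quick sanity check from the CLI, you can confirm the VPC exists by filtering on the databricks-managed- name prefix:
# List Databricks-managed VPCs in the current project
gcloud compute networks list --filter="name~^databricks-managed"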
Block all egress (deny-by-default)
Create a rule to block all egress traffic. Set priority to 1100 so it applies after all specific allow rules (which use priority 1000-1099). Blocking all egress also blocks access to public package repositories, such as pypi.org and Maven Central. Plan for this before enabling a deny-all rule.
gcloud compute firewall-rules create deny-egress \
--action DENY \
--rules all \
--destination-ranges 0.0.0.0/0 \
--direction EGRESS \
--priority 1100 \
--network <vpc-name>
Set this rule's priority to 1100. Lower priority numbers are evaluated first, so if you set this deny rule's priority below your allow rules (1000-1099), it blocks Databricks from reaching essential services and renders your workspace unusable.
If you need to access internal IP space (10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16), add specific egress rules for those ranges with priority 1000-1099.
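For example, a rule like the following (the name to-internal-ranges is illustrative) allows egress to the RFC 1918 private ranges at a priority between the specific allow rules and the deny-all rule. Narrow the destination ranges to the internal networks you actually need:
# Illustrative rule: allow egress to RFC 1918 private ranges
gcloud compute firewall-rules create to-internal-ranges \
--action ALLOW \
--rules all \
--destination-ranges 10.0.0.0/8,172.16.0.0/12,192.168.0.0/16 \
--direction EGRESS \
--priority 1050 \
--network <vpc-name>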
Allow Google APIs (restricted access)
Enable access to Google Cloud services through VPC Service Controls-compliant endpoints. These two rules work together to provide secure access to Google APIs:
- 199.36.153.4/30: Core restricted API access for GCE
- 34.126.0.0/18: Entry point for VPC Service Controls-compliant services
gcloud compute firewall-rules create to-google-apis \
--action ALLOW \
--rules all \
--destination-ranges 199.36.153.4/30 \
--direction EGRESS \
--priority 1000 \
--network <vpc-name>
gcloud compute firewall-rules create to-google-apis-entry-point \
--action ALLOW \
--rules tcp:443 \
--destination-ranges 34.126.0.0/18 \
--direction EGRESS \
--priority 1000 \
--network <vpc-name>
Allow intra-subnet communication
Allow traffic within your Databricks subnet for proper cluster functionality. Use the primary IP range for GCE nodes that you specified in your workspace's advanced configuration during creation.
gcloud compute firewall-rules create databricks-<workspace-id>-egress-intra-subnet \
--action ALLOW \
--rules all \
--destination-ranges <databricks-subnet-cidr> \
--direction EGRESS \
--priority 1000 \
--network <vpc-name>
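If you don't have the subnet CIDR handy, you can look it up first; this lists each subnet attached to the VPC along with its primary range:
# Look up the value to use for <databricks-subnet-cidr>
gcloud compute networks subnets list \
--network <vpc-name> \
--format "table(name,region,ipCidrRange)"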
Allow Databricks control plane services (if PSC is not enabled)
Allow egress to Databricks control plane endpoints on ports 443 and 8443-8451. Get the regional IP addresses for your workspace from IP addresses and domains for Databricks services and assets.
Replace the placeholder IP addresses:
- <web-app-ips>: Regional ingress addresses for the web application (typically multiple CIDR blocks)
- <scc-relay-ip>: Regional IP address for the secure cluster connectivity relay
gcloud compute firewall-rules create to-databricks-control-plane \
--action ALLOW \
--rules tcp:443,tcp:8443-8451 \
--destination-ranges <web-app-ips>,<scc-relay-ip> \
--direction EGRESS \
--priority 1000 \
--network <vpc-name>
Databricks recommends using a firewall that supports FQDN-based rules instead of IP-based rules. Databricks IP addresses can change, and stale IP-based rules can break workspace or cluster creation until you update them.
Allow Private Service Connect endpoints
If you enabled Private Service Connect for your workspace (see Enable Private Service Connect), allow egress to the PSC endpoint subnet instead of the public control plane IPs.
gcloud compute firewall-rules create egress-to-databricks-psc-subnet \
--description "Allow egress traffic to Databricks Private Service Connect (PSC) endpoint subnet for workspace access over HTTPS." \
--action ALLOW \
--rules tcp:443,tcp:8443-8451 \
--destination-ranges <psc-subnet-range> \
--direction EGRESS \
--priority 1000 \
--network <vpc-name>
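After creating the rule, you can confirm it targets the intended subnet range and ports; the format expression simply projects the relevant fields:
# Confirm the PSC egress rule's destinations and allowed ports
gcloud compute firewall-rules describe egress-to-databricks-psc-subnet \
--format "yaml(name,destinationRanges,allowed)"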
Allow default Hive metastore (optional)
If you use the default metastore, allow egress to the regional metastore IP on port 3306. If you use an External Apache Hive metastore (legacy), create a rule for your external metastore instead.
gcloud compute firewall-rules create to-databricks-managed-hive \
--action ALLOW \
--rules tcp:3306 \
--destination-ranges <metastore-ip> \
--direction EGRESS \
--priority 1000 \
--network <vpc-name>
Databricks recommends using Unity Catalog (see What is Unity Catalog?) instead of the legacy Hive metastore.
Add custom rules (optional)
Add any additional egress rules your organization needs for data sources and other systems. Set all custom ALLOW rules to priority values 1000-1099 so the firewall evaluates them before the deny-all rule at 1100.
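Before relying on the deny-all rule, review the full rule set ordered by priority and confirm that every allow rule sorts ahead of deny-egress:
# List all rules on the VPC, highest-precedence (lowest number) first
gcloud compute firewall-rules list \
--filter="network~<vpc-name>" \
--sort-by priority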
Your firewall rules should look similar to the following:
| Name | Type | Filters | Protocols and ports | Action | Priority |
|---|---|---|---|---|---|
| to-google-apis | Egress | IP ranges: 199.36.153.4/30 | All | Allow | 1000 |
| to-google-apis-entry-point | Egress | IP ranges: 34.126.0.0/18 | tcp:443 | Allow | 1000 |
| to-databricks-control-plane or egress-to-databricks-psc-subnet | Egress | IP ranges: Regional control plane IP ranges or PSC endpoint IPs | tcp:443, 8443-8451 | Allow | 1000 |
| databricks-<workspace-id>-egress-intra-subnet | Egress | IP ranges: Workspace subnet range | All | Allow | 1000 |
| deny-egress | Egress | IP ranges: 0.0.0.0/0 | All | Deny | 1100 |
|  | Ingress | IP ranges: Workspace subnet range | All | Allow | 1000 |
Step 3: Update VPC routes
Add custom static routes to direct Google API traffic through the restricted VIP ranges. Each route uses the default internet gateway as the next hop. This configuration ensures VMs without public IPs can reach Google's restricted API endpoints.
Create routes for the following destination ranges:
- 199.36.153.4/30: Private Google Access API endpoints for restricted.googleapis.com
- 34.126.0.0/18: Restricted API entry point for restricted.googleapis.com
# Route for Private Google Access API endpoints
gcloud compute routes create allow-private-google-access \
--network <vpc-name> \
--destination-range 199.36.153.4/30 \
--next-hop-gateway default-internet-gateway \
--description "Route for Private Google Access (199.36.153.4/30)"
# Route for Restricted API entry points
gcloud compute routes create allow-restricted-google-apis \
--network <vpc-name> \
--destination-range 34.126.0.0/18 \
--next-hop-gateway default-internet-gateway \
--description "Route for restricted.googleapis.com entry point (34.126.0.0/18)"
For more information about how Cloud NAT interacts with Private Google Access, see NAT product interactions in the Google Cloud documentation.
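To confirm that both routes exist with the default internet gateway as the next hop, list the routes on the VPC:
# Verify the custom routes on the Databricks VPC
gcloud compute routes list --filter="network~<vpc-name>"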
Step 4: Configure Private DNS for restricted Google APIs
Configure a private DNS zone to redirect all Google API traffic to restricted.googleapis.com. This ensures Databricks VMs access only VPC Service Controls-compliant Google APIs, preventing data exfiltration to unsupported services. For more information, see Private Google Access options in the Google Cloud documentation.
Create a private DNS zone for googleapis.com
Set up a Cloud DNS private managed zone with the domain googleapis.com. Attach this zone to the VPC where your workloads run so that requests for *.googleapis.com resolve through your private zone instead of public DNS.
# Create a private DNS managed zone
gcloud dns managed-zones create restricted-apis-zone \
--description "Private DNS zone for restricted.googleapis.com" \
--dns-name "googleapis.com." \
--visibility private \
--networks "projects/<project-id>/global/networks/<vpc-name>"
Add A records for restricted.googleapis.com
Add A records with all four VIPs (199.36.153.4–199.36.153.7) for restricted.googleapis.com.
# Add A records for restricted.googleapis.com VIPs
gcloud dns record-sets create "restricted.googleapis.com." \
--type A \
--ttl 300 \
--zone restricted-apis-zone \
--rrdatas "199.36.153.4" "199.36.153.5" "199.36.153.6" "199.36.153.7"
Redirect API traffic using CNAMEs
Inside the private DNS zone, create a wildcard CNAME record (*.googleapis.com) pointing to restricted.googleapis.com. This forces all Google API traffic (for example, storage.googleapis.com, bigquery.googleapis.com) to resolve to the restricted VIP.
# Create a wildcard CNAME record pointing to restricted.googleapis.com
gcloud dns record-sets create "*.googleapis.com." \
--type CNAME \
--ttl 300 \
--zone restricted-apis-zone \
--rrdatas "restricted.googleapis.com."
This configuration routes all Google API requests through the restricted endpoint, so only VPC Service Controls-compliant APIs are accessible. Non-compliant services such as Gmail and Maps are blocked.
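To confirm that the zone contains both the A records and the wildcard CNAME, list its record sets:
# Verify the records in the private zone
gcloud dns record-sets list --zone restricted-apis-zone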
Step 5: Validate configuration
Test your firewall configuration to verify the workspace functions correctly with the restricted network egress:
- Create and start an all-purpose compute resource.
- Confirm that it starts successfully and reaches the Running state.
- Attach a notebook to the compute resource and run a simple test such as SELECT 1+1.
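As an additional check, assuming the private DNS zone from Step 4 is attached to the workspace VPC, you can verify from a notebook cell that Google API hostnames resolve to the restricted VIPs:
%sh
# Expect an address in the 199.36.153.4-199.36.153.7 range
nslookup storage.googleapis.com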
Automate deployment with Terraform
For production deployments or managing multiple workspaces, Databricks recommends using Terraform to automate the firewall configuration described above. The Security Reference Architecture (SRA) provides production-ready Terraform modules that automate:
- VPC firewall rules for deny-by-default egress, Databricks control plane access, and restricted Google APIs
- VPC routes for Private Google Access
- Private DNS zones and records for restricted.googleapis.com
To get started:
- Clone the SRA repository: git clone https://github.com/databricks/terraform-databricks-sra.git
- Review the GCP workspace deployment module.
- Configure the module with your project-specific values.
- Apply the Terraform configuration to provision your workspace with firewall rules, as sketched below.
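A minimal workflow sketch, assuming a standard Terraform setup; the module path varies, so check the repository layout for the actual GCP module directory:
git clone https://github.com/databricks/terraform-databricks-sra.git
cd terraform-databricks-sra   # then change into the GCP workspace module directory
terraform init                # download the Databricks and Google providers
terraform plan                # preview the firewall rules, routes, and DNS changes
terraform apply               # provision the workspace network configuration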
The Security Reference Architecture (SRA) is community-driven and not officially supported by Databricks.