With the Serverless compute version of the Databricks platform architecture, the compute layer exists in the Databricks cloud account rather than the customer’s cloud account.
As of the current release, Serverless compute is supported for use with Databricks SQL. Admins can create Serverless SQL endpoints, which provide instant compute and are managed by Databricks. Serverless SQL endpoints use compute clusters in the Databricks AWS account. Use them with Databricks SQL queries just as you would with the original customer-hosted SQL endpoints, which are now called Classic SQL endpoints.
Before you can create Serverless SQL endpoints, you must enable Serverless Databricks SQL endpoints for your workspace. If Serverless SQL endpoints are enabled for your workspace:
- New SQL endpoints are Serverless by default when created from the UI or the API, but you can also create new Classic SQL endpoints.
- You can convert a Classic SQL endpoint to a Serverless SQL endpoint or convert from Serverless to Classic.
- This feature only affects Databricks SQL. It does not affect how Databricks Runtime clusters work with notebooks and jobs in the Data Science & Engineering or Databricks Machine Learning workspace environments. Databricks Runtime clusters always run in the Classic data plane in your AWS account. See Compare Serverless compute to other Databricks architectures.
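As a rough illustration of the API path mentioned in the list above, the following sketch builds (but does not send) a JSON request body for creating a SQL endpoint. The REST path `/api/2.0/sql/endpoints/` and the `enable_serverless_compute` field are assumptions about the SQL endpoints API and should be checked against the API reference for your release. The endpoint names and the `build_endpoint_request` helper are hypothetical.

```python
import json

# Assumed REST path for SQL endpoints; verify against your workspace's API reference.
SQL_ENDPOINTS_PATH = "/api/2.0/sql/endpoints/"


def build_endpoint_request(name: str, cluster_size: str, serverless: bool = True) -> dict:
    """Return a request body for creating a SQL endpoint (hypothetical helper).

    `serverless` defaults to True to mirror the documented default when
    Serverless SQL endpoints are enabled for the workspace.
    """
    return {
        "name": name,
        "cluster_size": cluster_size,
        # Assumed field name: set to False to request a Classic SQL endpoint.
        "enable_serverless_compute": serverless,
    }


# Illustrative names only; nothing is sent over the network.
serverless_body = build_endpoint_request("analytics-endpoint", "Small")
classic_body = build_endpoint_request("legacy-endpoint", "Small", serverless=False)

print("POST", SQL_ENDPOINTS_PATH)
print(json.dumps(serverless_body, indent=2))
```

Converting an existing endpoint between Classic and Serverless would presumably flip the same flag in an edit request; that is likewise an assumption rather than a confirmed API detail.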
As of this release, Databricks supports Serverless compute in the following AWS regions: ap-southeast-2. See Supported Databricks regions.
Serverless compute must be enabled for your workspace and is subject to certain Service Specific Terms. The first time it is enabled for a workspace, an admin must read and accept the terms and conditions.
Databricks operates out of a control plane and a data plane:
- The control plane includes the backend services that Databricks manages in its own AWS account. Databricks SQL queries, notebook commands, and many other workspace configurations are stored in the control plane and encrypted at rest.
- The data plane is where data is processed by clusters of compute resources.
There are important differences between the Classic data plane (the original Databricks platform architecture) and the Serverless data plane:
- For a Classic data plane, Databricks compute resources run in the customer’s cloud account. Clusters perform distributed data analysis using queries (in Databricks SQL) or notebooks (in the Data Science & Engineering or Databricks Machine Learning environments):
  - New clusters are created within each workspace’s virtual network in the customer’s cloud account.
  - A Classic data plane has natural isolation because it runs in each customer’s own cloud account. A Classic data plane is not a shared resource for multiple customers.
- For a Serverless data plane, Databricks compute resources run in a special compute layer within the Databricks cloud account:
  - As of this release, the Serverless data plane is used only for Serverless SQL endpoints. Enabling this feature does not change how Databricks Runtime clusters work in the Data Science & Engineering or Databricks Machine Learning environments.
  - The Serverless data plane is a shared resource in the Databricks cloud account for multiple Databricks customers.
  - To protect customer data within the Serverless data plane, Serverless compute runs within a network boundary for the workspace, with various layers of security to isolate different Databricks customer workspaces and additional network controls between clusters of the same customer.
Databricks creates a Serverless data plane in the same AWS region as your workspace’s Classic data plane. Your workspace’s Databricks control plane instance is generally, but not always, in that same AWS region.
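The placement rules described above can be summarized in a small model. This is only a sketch for reasoning about which data plane a workload lands in. The class and function names are hypothetical and do not correspond to any Databricks API. The control plane region is deliberately left out because it is not guaranteed to match.

```python
from dataclasses import dataclass
from enum import Enum


class DataPlane(Enum):
    CLASSIC = "classic"        # compute runs in the customer's cloud account
    SERVERLESS = "serverless"  # compute runs in the Databricks cloud account


class Workload(Enum):
    SQL_ENDPOINT = "sql_endpoint"        # Databricks SQL
    RUNTIME_CLUSTER = "runtime_cluster"  # notebooks and jobs


@dataclass(frozen=True)
class Workspace:
    classic_region: str       # AWS region of the workspace's Classic data plane
    serverless_enabled: bool  # Serverless SQL endpoints enabled for the workspace


def data_plane_for(ws: Workspace, workload: Workload, wants_serverless: bool = True) -> DataPlane:
    """Pick the data plane for a workload under the rules in this section."""
    # Only SQL endpoints can use the Serverless data plane, and only when the
    # feature is enabled and the endpoint was not created as Classic.
    if workload is Workload.SQL_ENDPOINT and ws.serverless_enabled and wants_serverless:
        return DataPlane.SERVERLESS
    # Databricks Runtime clusters always run in the Classic data plane.
    return DataPlane.CLASSIC


def serverless_region(ws: Workspace) -> str:
    """The Serverless data plane is created in the Classic data plane's region."""
    return ws.classic_region


ws = Workspace(classic_region="ap-southeast-2", serverless_enabled=True)
print(data_plane_for(ws, Workload.SQL_ENDPOINT).value)
print(data_plane_for(ws, Workload.RUNTIME_CLUSTER).value)
```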
The following diagram shows important differences between the Serverless data plane and Classic data plane.
For more information about secure cluster connectivity, which is mentioned in the diagram, see Secure cluster connectivity.
The table below summarizes differences between Serverless compute and the other versions of Databricks, focusing on product security. It is not a complete explanation of those security features or a detailed comparison. For more details about Serverless compute security, or if you have questions about items in this table, contact your Databricks representative.
|Item||Serverless data plane (AWS only)||Classic data plane (AWS and Azure)|
|Location of control plane resources||Databricks cloud account||Databricks cloud account|
|Location of data plane compute resources||Serverless data plane (VPC in the Databricks AWS account)||Classic data plane (VPC in the customer’s cloud provider account)|
|Data plane compute resources||Databricks-managed Kubernetes (EKS) clusters||Databricks-managed standalone VMs|
|Customer access to data plane||Access through Databricks control plane||
|Who pays for unassigned VMs for Databricks SQL?||Databricks||Not applicable. For Classic SQL endpoints, there is no concept of unassigned VMs. In Databricks SQL, there is no direct equivalent to warm instance pools for notebooks and jobs.|
|Who pays for VMs after starting an endpoint or running a query in Databricks SQL?||Customer pays based on DBUs until Auto Stop stops the SQL endpoint.||Customer pays AWS for the VMs, and customer pays Databricks based on DBUs.|
|Virtual private cloud (VPC) for data plane||VPC in Databricks account is shared among customers, with additional network boundaries between workspaces and between clusters.||
|OS image||Databricks-modified cloud-managed Amazon Linux 2||Databricks-managed Ubuntu or CentOS|
|Technology that manages default egress from the VPC||Databricks-created AWS internet gateway||Default internet gateway or load balancer provided by the cloud|
|Customize VPC and firewall settings||No||Yes|
|Customize CIDR ranges||No||Yes|
Secure cluster connectivity
|Container-level network isolation for Databricks Runtime clusters||Uses Kubernetes network policy||Uses Databricks-managed
|VM-level network isolation for Databricks Runtime clusters||Security group isolation||Security group and isolation of VPC (AWS) or VNet (Azure)|
|VM isolation||VMs in a cluster can communicate among themselves, but no ingress traffic is allowed from other clusters.||VMs in a cluster can communicate among themselves, but no ingress traffic is allowed from other clusters.|
|Communication between control plane and data plane||Direct Databricks-managed TLS encrypted communication using public IP, with connection initiated from the control plane.||Secure cluster connectivity|
|Credential for initial deployment||Databricks internal IAM roles||
|Credential for regular data plane operations||Databricks invokes
|Location of the storage for DBFS root and workspace system data||Customer creates the S3 bucket in the customer account as part of workspace creation.||
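
The two billing rows of the table can be read as a simple cost split. The sketch below is illustrative only: the function name, parameters, and all rates are hypothetical placeholders, not Databricks or AWS prices. It assumes the endpoint runs for a known number of hours before Auto Stop stops it.

```python
def cost_breakdown(plane: str, hours: float, dbus_per_hour: float,
                   dbu_rate: float, vm_rate_per_hour: float) -> dict:
    """Split the cost of a running SQL endpoint by who the customer pays.

    plane: "serverless" or "classic".
    All rates are hypothetical inputs supplied by the caller.
    """
    if plane not in ("serverless", "classic"):
        raise ValueError(f"unknown data plane: {plane!r}")

    # In both cases the customer pays Databricks based on DBUs while the
    # endpoint is running.
    to_databricks = hours * dbus_per_hour * dbu_rate

    # Only with a Classic SQL endpoint does the customer also pay the cloud
    # provider for the VMs, because those VMs run in the customer's account.
    to_cloud_provider = hours * vm_rate_per_hour if plane == "classic" else 0.0

    return {"databricks": to_databricks, "cloud_provider": to_cloud_provider}


# Made-up numbers purely to show the shape of the result.
serverless_cost = cost_breakdown("serverless", hours=2, dbus_per_hour=4,
                                 dbu_rate=0.5, vm_rate_per_hour=1.0)
classic_cost = cost_breakdown("classic", hours=2, dbus_per_hour=4,
                              dbu_rate=0.5, vm_rate_per_hour=1.0)
print(serverless_cost)
print(classic_cost)
```

The DBU rate for Serverless and Classic endpoints need not be the same in practice; the sketch uses one rate only to keep the comparison focused on who is paid.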