Serverless compute
With the serverless compute version of the Databricks platform architecture, the compute layer exists in your Databricks account rather than your AWS account.
Serverless SQL warehouses are enabled by default for new workspaces. To enable them for an existing workspace, see Enable serverless SQL warehouses.
For details on the quotas enforced for serverless compute, see Serverless quotas.
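As an illustration of what enabling serverless looks like programmatically, a workspace with serverless enabled can request a serverless SQL warehouse through the SQL Warehouses REST API (`POST /api/2.0/sql/warehouses`). The sketch below only assembles the request, it does not send it; the workspace URL and token are placeholders, and the `enable_serverless_compute` field name should be verified against the current API reference.

```python
import json


def build_warehouse_request(workspace_url: str, token: str, name: str):
    """Assemble (url, headers, body) for creating a serverless SQL warehouse.

    Sketch only: field names follow the SQL Warehouses API as the author
    understands it; check the current API reference before relying on them.
    """
    url = f"{workspace_url}/api/2.0/sql/warehouses"
    headers = {
        "Authorization": f"Bearer {token}",  # token is a placeholder PAT
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "name": name,
        "cluster_size": "Small",
        "enable_serverless_compute": True,  # serverless rather than classic
    })
    return url, headers, body
```

The helper returns the pieces separately so the caller can pass them to any HTTP client.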
Model Serving
Databricks Model Serving deploys your MLflow machine learning (ML) models and exposes them as REST API endpoints that run in your Databricks account. The serverless compute resources run as Databricks AWS resources in what is known as the serverless compute plane.
In contrast, the legacy model serving architecture is a single-node cluster that runs in your AWS account within the classic compute plane.
Model Serving provides the following benefits:
- Easy configuration and compute resource management: Databricks automatically prepares a production-ready environment for your model and makes it easy to switch its compute configuration.
- High availability and scalability: Serverless model endpoints autoscale, which means that the number of server replicas automatically adjusts based on the volume of scoring requests.
- Dashboards: Use the built-in serverless model endpoint dashboard to monitor the health of your model endpoints using metrics such as queries-per-second (QPS), latency, and error rate.
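For example, a deployed model is scored by POSTing JSON to the endpoint's `invocations` URL. The helper below is a minimal sketch: the URL pattern and the `dataframe_records` payload follow MLflow serving conventions, and the workspace URL, endpoint name, and token are placeholders, not values from this document.

```python
import json


def build_scoring_request(workspace_url: str, endpoint_name: str,
                          token: str, records: list):
    """Assemble (url, headers, body) for scoring a served model.

    Sketch under assumptions: the invocations URL pattern and the
    `dataframe_records` input format follow MLflow serving conventions.
    """
    url = f"{workspace_url}/serving-endpoints/{endpoint_name}/invocations"
    headers = {
        "Authorization": f"Bearer {token}",  # placeholder access token
        "Content-Type": "application/json",
    }
    body = json.dumps({"dataframe_records": records})
    return url, headers, body
```

Each `records` entry is one row to score, keyed by feature name, e.g. `[{"x": 1.0}]`.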
For regional support, see Databricks clouds and regions.
Compare serverless compute to other Databricks architectures
Databricks operates out of a control plane and a compute plane:
- The control plane includes the backend services that Databricks manages in its own AWS account. Databricks SQL queries, notebook commands, and many other workspace configurations are stored in the control plane and encrypted at rest.
- The compute plane is where data is processed by clusters of compute resources.
There are important differences between the classic compute plane (the original Databricks platform architecture) and the serverless compute plane:
For a classic compute plane, Databricks compute resources run in your AWS account. Clusters perform distributed data analysis using queries (in Databricks SQL) or notebooks (in the Data Science & Engineering or Databricks Machine Learning environments):
- New clusters are created within each workspace’s virtual network in the customer’s AWS account.
- A classic compute plane has natural isolation because it runs in each customer’s own AWS account.
For a serverless compute plane, Databricks compute resources run in a compute layer within your Databricks account:
- The serverless compute plane is used for serverless SQL warehouses and Model Serving. Enabling serverless compute does not change how Databricks Runtime clusters work in the Data Science & Engineering or Databricks Machine Learning environments.
- To protect customer data within the serverless compute plane, serverless compute runs within a network boundary for the workspace, with multiple layers of security that isolate different Databricks customer workspaces and additional network controls between clusters of the same customer.
- Databricks creates a serverless compute plane in the same AWS region as your workspace’s classic compute plane.
- Worker nodes are private, which means they do not have public IP addresses.
- For communication between the Databricks control plane and the serverless compute plane:
  - For Databricks SQL Serverless, the communication uses private connectivity.
  - For Model Serving, the communication uses mTLS encryption, with connections initiated from the control plane and access limited to control plane IP addresses.
- When reading or writing to AWS S3 buckets in the same region as your workspace, serverless SQL warehouses access S3 directly through AWS gateway endpoints. This applies when a serverless SQL warehouse reads from or writes to your workspace’s root S3 bucket in your AWS account, as well as to other S3 data sources in the same region.
The following diagram shows important differences between the serverless compute plane and classic compute plane for both serverless features.
For more information about secure cluster connectivity, which is mentioned in the diagram, see Secure cluster connectivity.
The table below summarizes differences between serverless compute and the classic compute plane architecture of Databricks, focusing on product security. It is not a complete explanation of those security features or a detailed comparison.
| Item | Serverless compute plane | Classic compute plane |
| --- | --- | --- |
| Location of control plane resources | Databricks account | Databricks account |
| Location of compute plane resources | Serverless compute plane in your Databricks account | Classic compute plane in your AWS account |
| Compute plane resources | Databricks-managed dedicated compute | Databricks-managed dedicated compute |
| Who pays for unassigned VMs for Databricks SQL? | Databricks | Customer pays AWS for VMs while launching and cleaning up cluster |
| Virtual network (VPC) for compute plane | In your Databricks account, with network boundaries between workspaces and between clusters | In your AWS account |
| Customize CIDR ranges | Not applicable | Yes, if you use a customer-managed VPC |
| Public IP addresses for cluster nodes | No | If secure cluster connectivity is enabled (the default for all new E2 workspaces), there are no public IPs for VMs. If it is disabled, there is one public IP for each VM. |
| Network isolation | Enforced | Enforced |
| Credential for compute plane access | Typically Unity Catalog short-lived tokens or serverless short-lived tokens | Typically Unity Catalog short-lived tokens or instance profiles |