SQL warehouse sizing, scaling, and queuing behavior

This article explains the cluster sizing, queuing, and autoscaling behavior of SQL warehouses.

Sizing overview

SQL warehouses are available in serverless, pro, and classic types which have different performance features and optimizations that can affect query performance in your warehouse. See SQL warehouse types. Databricks recommends using serverless SQL warehouses when available.

For any warehouse type, you choose a Cluster size for its compute resources. Optimizing your Databricks SQL warehouse size involves more than just considering data volume or user count. Query complexity and the number of concurrent queries are also key factors in performance.

Databricks SQL warehouses use dynamic concurrency to handle these demands. Unlike static-capacity warehouses, Databricks SQL adjusts compute resources in real time to manage concurrent loads and maximize throughput. Each warehouse size category has a fixed compute capacity per unit, but the system scales the number of resources to accommodate varying demands.

Cluster sizes for SQL warehouses

The table in this section maps SQL warehouse cluster sizes to Databricks cluster driver size and worker counts. The driver size only applies to pro and classic SQL warehouses.

note

For serverless SQL warehouses, the cluster sizes may, in some cases, use different instance types than the ones listed in the documentation for pro and classic SQL warehouses for equivalent cluster size. In general, the price/performance ratio of the cluster sizes for serverless SQL warehouses is similar to those for pro and classic SQL warehouses.

Cluster size	Instance type for driver (applies only to pro and classic SQL warehouses)	Worker count
2X-Small	i3.2xlarge	1 x i3.2xlarge
X-Small	i3.2xlarge	2 x i3.2xlarge
Small	i3.4xlarge	4 x i3.2xlarge
Medium	i3.8xlarge	8 x i3.2xlarge
Large	i3.8xlarge	16 x i3.2xlarge
X-Large	i3.16xlarge	32 x i3.2xlarge
2X-Large	i3.16xlarge	64 x i3.2xlarge
3X-Large	i3.16xlarge	128 x i3.2xlarge
4X-Large	i3.16xlarge	256 x i3.2xlarge

The instance size of all workers is i3.2xlarge. If your workspace has the compliance security profile enabled, warehouses will use i3en instance types instead of i3.

note

The information in this table can vary based on product or region availability and workspace type.

Availability zones (AZ) for pro and classic SQL warehouses

For SQL warehouses, AWS availability zones are set to auto (Auto-AZ), where the AZ is automatically selected based on available IPs in the workspace subnets. Auto-AZ retries in other availability zones if AWS returns insufficient capacity errors. For more about availability zones, see the AWS documentation.

Queueing and autoscaling for pro and classic SQL warehouses

Databricks limits the number of queries on a cluster assigned to a SQL warehouse based on the cost to compute their results. The upscaling of clusters per warehouse is based on query throughput, the rate of incoming queries, and the queue size. Databricks recommends a cluster for every 10 concurrent queries. The maximum number of queries in a queue for all SQL warehouse types is 1000.

Databricks adds clusters based on the time it would take to process all currently running queries, all queued queries, and the incoming queries expected in the next two minutes.

If less than 2 minutes, don't upscale.
If 2 to 6 minutes, add 1 cluster.
If 6 to 12 minutes, add 2 clusters.
If 12 to 22 minutes, add 3 clusters.

Otherwise, Databricks adds 3 clusters plus 1 cluster for every additional 15 minutes of expected query load.

In addition, a warehouse is always upscaled if a query waits for 5 minutes in the queue.

If the load is low for 15 minutes, Databricks downscales the SQL warehouse. It keeps enough clusters to handle the peak load over the last 15 minutes. For example, if the peak load was 25 concurrent queries, Databricks keeps 3 clusters.

Serverless autoscaling and query queuing

Intelligent workload management (IWM) is a set of features that enhances the ability of serverless SQL warehouses to process large numbers of queries quickly and cost-effectively. It dynamically manages workloads by using machine learning models to predict the resource demands of incoming queries while monitoring the warehouse's available compute capacity in real time. Tracking these and other signals in the warehouse allows IWM to respond to changes in workload demands.

This dynamic management allows IWM to do the following:

Rapidly upscale compute to maintain low latency.
Provide query admittance at rates closer to the hardware's limitation.
Quickly downscale to minimize costs when demand is low.

When a query arrives to the warehouse, IWM predicts its cost. At the same time, IWM is real-time monitoring the available compute capacity of the warehouse. Next, using machine learning models, IWM predicts if the incoming query has the necessary compute available on the existing compute. If it doesn't have the compute needed, then the query is added to the queue. If it does have the compute needed, the query begins running immediately.

IWM monitors the queue in real time. If the queue is not decreasing quickly enough, autoscaling automatically provisions more compute. After new capacity is added, queued queries are admitted to the new compute resources. With serverless SQL warehouses, new compute can be added rapidly. The maximum number of queries in a queue for all SQL warehouse types is 1000.

Sizing a serverless SQL warehouse

Start with a larger size for your serverless SQL warehouse than you think you will need and size down as you test. Don't start with a small size for your serverless SQL warehouse and go up. In general, start with a single serverless SQL warehouse and rely on Databricks to right-size with serverless clusters, prioritizing workloads, and fast data reads. See Serverless autoscaling and query queuing.

To decrease query latency for a given serverless SQL warehouse:
- If queries are spilling to disk, increase the t-shirt size.
- If the queries are highly parallelizable, increase the t-shirt size.
- If you are running multiple queries at a time, add more clusters for autoscaling.
To reduce costs, try to step down in size without spilling to disk or significantly increasing latency.

Tools to monitor and evaluate performance

To help right-size your SQL warehouse, use the following tools:

Monitoring page: Review the peak query count. If the peak queued is commonly above one, add clusters. The maximum number of queries in a queue for all SQL warehouse types is 1000. See Monitor a SQL warehouse.
Query history. See Query history.
Query profiles (look for Bytes spilled to disk above 1). See Query profile.

Sizing overview​

Cluster sizes for SQL warehouses​

Availability zones (AZ) for pro and classic SQL warehouses​

Queueing and autoscaling for pro and classic SQL warehouses​

Serverless autoscaling and query queuing​

Sizing a serverless SQL warehouse​

Tools to monitor and evaluate performance​