SQL warehouse sizing, scaling, and queuing behavior

This article explains the cluster sizing, queuing, and autoscaling behavior of SQL warehouses.

Note

This article mostly applies to pro and classic SQL warehouses. For information on serverless SQL warehouse behavior, see Serverless compute.

Cluster size

The table in this section maps SQL warehouse cluster sizes to Databricks cluster driver size and worker counts. The driver size only applies to pro and classic SQL warehouses.

Note

For serverless SQL warehouses, the cluster sizes may in some cases use different instance types than the ones listed in the documentation for pro and classic SQL warehouses for an equivalent cluster size. In general, the price/performance ratio of the cluster sizes for serverless SQL warehouses is similar to those for pro and classic SQL warehouses.

Cluster size

Instance type for driver (applies only to pro and classic SQL warehouses)

Worker count

2X-Small

i3.2xlarge

1 x i3.2xlarge

X-Small

i3.2xlarge

2 x i3.2xlarge

Small

i3.4xlarge

4 x i3.2xlarge

Medium

i3.8xlarge

8 x i3.2xlarge

Large

i3.8xlarge

16 x i3.2xlarge

X-Large

i3.16xlarge

32 x i3.2xlarge

2X-Large

i3.16xlarge

64 x i3.2xlarge

3X-Large

i3.16xlarge

128 x i3.2xlarge

4X-Large

i3.16xlarge

256 x i3.2xlarge

The instance size of all workers is i3.2xlarge.

Availability zones (AZ)

For SQL warehouses, AWS availability zones are set to auto (Auto-AZ), where the AZ is automatically selected based on available IPs in the workspace subnets. Auto-AZ retries in other availability zones if AWS returns insufficient capacity errors. For more about availability zones, see the AWS documentation.

Queueing and autoscaling

Databricks limits the number of queries on a cluster assigned to a SQL warehouse based on the cost to compute their results. Upscaling of clusters per warehouse is based on query throughput, the rate of incoming queries, and the queue size.

Databricks adds clusters based on the time it would take to process all currently running queries, all queued queries, and the incoming queries expected in the next two minutes.

  • If less than 2 minutes, don’t upscale.

  • If 2 to 6 minutes, add 1 cluster.

  • If 6 to 12 minutes, add 2 clusters.

  • If 12 to 22 minutes, add 3 clusters.

Otherwise, Databricks adds 3 clusters plus 1 cluster for every additional 15 minutes of expected query load.

In addition, a warehouse is always upscaled if a query waits for 5 minutes in the queue.

If the load is low for 15 minutes, Databricks downscales the SQL warehouse. It keeps enough clusters to handle the peak load over the last 15 minutes. For example, if the peak load was 25 concurrent queries, Databricks keeps 3 clusters.

Query queuing

Databricks queues queries when all clusters assigned to the warehouse are executing queries at full capacity or when the warehouse is in the STARTING state.

Metadata queries (for example, DESCRIBE <table>) and state modifying queries (for example SET) are never queued, unless the warehouse is in the STARTING state.

Query routing

Databricks routes queries as follows:

  • New session: to the cluster with the least load.

  • Existing session: to the cluster that ran the previous query for that session. If that cluster does not have available capacity, the query is routed to the cluster with the least load.