Enable serverless SQL warehouses
With serverless compute, the compute layer runs in your Databricks account rather than in your AWS account. Databricks SQL supports serverless compute: admins can create serverless SQL warehouses that provide instant compute and are managed by Databricks. Use serverless SQL warehouses with Databricks SQL queries the same way you would use the original customer-hosted SQL warehouses, which are now called classic SQL warehouses.
If your workspace satisfies the serverless SQL warehouse requirements:
New SQL warehouses are serverless by default when created from the UI, but you can also create new pro and classic SQL warehouses.
From the SQL Warehouses API or Terraform, Databricks recommends that you always explicitly set `enable_serverless_compute` to `true` and `warehouse_type` to `pro` to create serverless SQL warehouses. For details about the defaults if these fields are omitted, see SQL Warehouses API and Terraform.

You can create serverless SQL warehouses with the UI or API, or convert existing warehouses to serverless.
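As a sketch of the recommended explicit settings, the following builds a create-warehouse request payload for the SQL Warehouses API (`POST /api/2.0/sql/warehouses`). The warehouse name, cluster size, and environment variables are illustrative assumptions, not required values:

```python
# Sketch: build a create request for a serverless SQL warehouse via the
# SQL Warehouses API. Name and cluster size below are placeholders.
import json


def serverless_warehouse_payload(name: str, cluster_size: str = "Small") -> dict:
    # Explicitly set both fields, as Databricks recommends, rather than
    # relying on workspace-dependent defaults.
    return {
        "name": name,
        "cluster_size": cluster_size,
        "enable_serverless_compute": True,
        "warehouse_type": "PRO",
    }


payload = serverless_warehouse_payload("my-serverless-warehouse")
print(json.dumps(payload, indent=2))

# To actually create the warehouse (requires workspace credentials):
# import os, requests
# resp = requests.post(
#     f"{os.environ['DATABRICKS_HOST']}/api/2.0/sql/warehouses",
#     headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
#     json=payload,
# )
# resp.raise_for_status()
```

Setting both fields in every request keeps behavior consistent across workspaces, regardless of when each workspace was enabled for serverless.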
This feature only affects Databricks SQL. It does not affect how Databricks Runtime clusters work with notebooks and jobs in the Data Science & Engineering or Databricks Machine Learning workspace environments.
Serverless SQL warehouses do not have public IP addresses. For more architectural information, see Serverless compute.
Depending on when your Databricks account was created, to create serverless SQL warehouses, your organization may need to perform several tasks:
| Task | Who can do this step? | Where is this done? |
|---|---|---|
| If your account needs updated terms of use, accept them. Workspace admins are prompted in the Databricks SQL UI. | Account owner or account administrator. | The account console's settings page. |
| If your workspace uses an instance profile for Databricks SQL, you may need to update its role to add a trust relationship. | A workspace administrator must confirm which instance profile your workspace uses for Databricks SQL. An AWS administrator with permissions to view and change AWS IAM policies must check the role's trust relationship policy and make any necessary changes. | The workspace's Databricks SQL settings page and the AWS console. |
This article describes how to perform these steps. If you do not hold all of these roles (for example, you are a workspace admin but not an account admin, or you do not have access to your AWS IAM roles), you may need to ask others in your organization to perform some steps.
If acceptance of terms is required for your account to enable serverless compute, any workspace admin who uses Databricks SQL sees a banner at the top of each page indicating that an account admin must accept the terms of use. If the workspace admin is not an account admin, contact your account admin to do this step.
Note
Databricks changed the name from SQL endpoint to SQL warehouse because, in the industry, endpoint refers to either a remote computing device that communicates with a network that it's connected to, or an entry point to a cloud service. A data warehouse is a data management system that stores current and historical data from multiple sources in a business-friendly manner for easier insights and reporting. SQL warehouse accurately describes the full capabilities of this compute resource.
Requirements
Account requirements:
Your Databricks account must be on the E2 version of the platform.
Your Databricks workspace must be on the Premium or higher pricing tier.
Your Databricks account must not be on a free trial.
Your Databricks account must not have the compliance security profile enabled at the account level.
Workspace requirements:
Your Databricks account must not have the compliance security profile enabled at the workspace level for any workspaces that you intend to use with any serverless compute features, such as serverless SQL warehouses.
Your workspace must not use an external Hive legacy metastore.
Your workspace must not use S3 access policies.
Other feature interactions:
Cluster policies, including spot instance policies, are not supported.
Customer-managed VPCs are not applicable to compute resources for serverless SQL warehouses.
Although the Serverless data plane for serverless SQL warehouses does not use the customer-configurable AWS PrivateLink connectivity for the Classic data plane, it does use private connectivity to connect to the Databricks control plane.
Although the serverless data plane does not use the secure cluster connectivity relay for the classic data plane, serverless SQL warehouses do not have public IP addresses.
Serverless SQL warehouses do not use customer-managed keys for EBS storage encryption, which is an optional part of the customer-managed keys for workspace storage feature configuration. Disks for serverless compute resources are short-lived and tied to the lifecycle of the serverless workload. For example, when serverless SQL warehouses are stopped or scaled down, the VMs and their storage are destroyed. See Serverless compute and customer-managed keys.
For a list of regions that support serverless SQL warehouses, see Databricks clouds and regions.
Also note that the Databricks documentation on cluster sizes, instance types, and CPU quotas applies only to pro and classic SQL warehouses, not to serverless SQL warehouses.
Serverless quotas
Serverless quotas are a safety measure for serverless compute. Serverless quotas restrict how many serverless compute resources a customer can have at any given time. The quota is enforced at the regional level for all workspaces in your account. Quotas are enforced only for serverless SQL warehouses. See Serverless quotas.
Step 1: If prompted, accept updated account terms of use
Important
If your account needs updated terms of use, workspace admins are prompted in the Databricks SQL UI. If you are a workspace admin and you do not see a yellow notification when using Databricks SQL, you can skip this step.
If you are not an account owner or account administrator, you cannot perform this step. Contact the account owner or an account administrator before continuing to the next steps in this article.
As an account owner or account administrator, go to the feature enablement tab of the account console settings page.
Next to Enable use of serverless compute, click the blue button Enable.
If the blue button does not appear but there is text that says Enabled, this step is already complete. Continue to Step 2: Confirm or set up an AWS instance profile to use with your serverless SQL warehouses.
A pop-up appears about agreeing to applicable terms of use. Click the link to open the applicable terms in a new browser tab. When complete, return to the original tab and click the Enable button in the pop-up.
Step 2: Confirm or set up an AWS instance profile to use with your serverless SQL warehouses
An instance profile is a container for an IAM role that you can use to pass role information to an EC2 instance when the instance starts. You can optionally configure an AWS instance profile for Databricks SQL to connect to AWS S3 buckets other than your root bucket.
If you already use an instance profile with Databricks SQL, the role associated with the instance profile needs a Databricks Serverless compute trust relationship statement so that serverless SQL warehouses can use it.
Depending on how and when your instance profile was created, you might not need to modify the role because it may already have the trust relationship. If the instance profile was created in the following ways, it likely has the trust relationship statement:
After June 24, 2022, your instance profile was created as part of creating a Databricks workspace by using AWS Quickstart.
After June 24, 2022, someone in your organization followed steps in the Databricks article to create the instance profile manually.
This section describes how to confirm or update that the role associated with the instance profile has the trust relationship statement. That enables your serverless SQL warehouses to use the role to access your S3 buckets.
Important
To perform these steps, you must be a Databricks workspace admin to confirm which instance profile your workspace uses for Databricks SQL. You must also be an AWS account administrator to check the role’s trust relationship policy or make any necessary changes. If you are not both of these types of admin, contact the appropriate admins in your organization to complete these steps.
In the admin console, click the SQL Warehouse Settings tab.
Look in the Data Security section for the Instance Profile field. Confirm whether your workspace is configured to use an AWS instance profile for Databricks SQL to connect to AWS S3 buckets other than your root bucket.
If you are using an instance profile, its name is visible in the Instance Profile field. Make a note of it for the next step.
If the field value is None, you are not using an instance profile to access S3 buckets other than your workspace’s root bucket. Setup is complete. Skip to Step 3.
Confirm whether your instance profile name matches the associated role name.
In the AWS console, go to the IAM service’s Roles tab. It lists all the IAM roles in your account.
Click the role whose name matches the instance profile name that you found in the Instance Profile field of the Databricks SQL admin settings earlier in this section.
In the summary area, find the Role ARN and Instance Profile ARNs fields.
Check if the last part of those two fields have matching names after the final slash. For example:
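As an illustration (the ARNs below are hypothetical), the comparison of the final path segments can be sketched as:

```python
# Sketch: check whether a role ARN and an instance profile ARN share the
# same final name segment. The ARNs below are hypothetical examples.
def arn_name(arn: str) -> str:
    """Return the text after the final slash in an ARN."""
    return arn.rsplit("/", 1)[-1]


role_arn = "arn:aws:iam::123456789012:role/my-dbsql-role"
instance_profile_arn = "arn:aws:iam::123456789012:instance-profile/my-dbsql-role"

# Both names are "my-dbsql-role" here, so the comparison succeeds.
print(arn_name(role_arn) == arn_name(instance_profile_arn))  # True
```

If the two names differ, continue with the next step to register the role ARN on the instance profile.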
If you determined in the previous step that the role name (the text after the last slash in the role ARN) and the instance profile name (the text after the last slash in the instance profile ARN) do not match, edit your instance profile registration to specify your IAM role ARN.
To edit your instance profiles, look below the Instance profile field and click the Configure button.
Click your instance profile’s name.
Click Edit.
In the optional Role ARN field, paste the role ARN for the role associated with your instance profile. This is the key step that allows your instance profile to work with Databricks SQL Serverless even if the role name does not match the instance profile name.
Click Save.
Within the AWS console, confirm or edit the trust relationship.
In the AWS console IAM service’s Roles tab, click the instance profile role that you want to modify.
Click the Trust relationships tab.
View the existing trust policy. If the policy already includes the JSON block below, then this step was completed at an earlier time and you can ignore the following instructions.
Click Edit trust policy.
Within the existing `Statement` array, append the following JSON block to the end of the existing trust policy. Ensure that you don't overwrite the existing policy.

```json
{
  "Effect": "Allow",
  "Principal": {
    "AWS": [
      "arn:aws:iam::790110701330:role/serverless-customer-resource-role"
    ]
  },
  "Action": "sts:AssumeRole",
  "Condition": {
    "StringEquals": {
      "sts:ExternalId": [
        "databricks-serverless-<YOUR_WORKSPACE_ID1>",
        "databricks-serverless-<YOUR_WORKSPACE_ID2>"
      ]
    }
  }
}
```

The only thing you need to change in the statement is the workspace ID. Replace `<YOUR_WORKSPACE_ID>` with one or more Databricks workspace IDs for the workspaces that will use this role. To get your workspace ID while you are using your workspace, check the URL. For example, in `https://<databricks-instance>/?o=6280049833385130`, the number after `o=` is the workspace ID.

Do not edit the principal of the policy. The `Principal.AWS` field must continue to have the value `arn:aws:iam::790110701330:role/serverless-customer-resource-role`. This references a serverless compute role managed by Databricks.

Click Review policy.

Click Save changes.
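The trust-relationship statement can also be generated programmatically. This sketch builds the statement for a list of workspace IDs and extracts a workspace ID from a workspace URL; the URL below is a placeholder:

```python
# Sketch: build the Databricks serverless trust-relationship statement for
# one or more workspace IDs, and extract a workspace ID from a workspace URL.
from urllib.parse import parse_qs, urlparse

# Fixed Databricks-managed serverless compute role; do not change this value.
SERVERLESS_PRINCIPAL = (
    "arn:aws:iam::790110701330:role/serverless-customer-resource-role"
)


def trust_statement(workspace_ids: list) -> dict:
    return {
        "Effect": "Allow",
        "Principal": {"AWS": [SERVERLESS_PRINCIPAL]},
        "Action": "sts:AssumeRole",
        "Condition": {
            "StringEquals": {
                "sts:ExternalId": [
                    f"databricks-serverless-{wid}" for wid in workspace_ids
                ]
            }
        },
    }


def workspace_id_from_url(url: str) -> str:
    # The workspace ID is the value of the "o" query parameter.
    return parse_qs(urlparse(url).query)["o"][0]


# Hypothetical workspace URL:
wid = workspace_id_from_url("https://example.cloud.databricks.com/?o=6280049833385130")
print(trust_statement([wid]))
```

Appending the generated statement to the existing `Statement` array (rather than replacing the policy) preserves any trust relationships the role already has.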
Important
If your instance profile changes at a later time, repeat these steps to ensure that the trust relationship for the instance profile’s role contains the required extra statement.
Step 3: Test usage of serverless SQL warehouses
Create or convert a warehouse:
Create a new serverless SQL warehouse using the SQL warehouse UI. Note that by default, new SQL warehouses are serverless in the UI if the workspace complies with the requirements.
Create a new serverless SQL warehouse using the REST API. By default, new SQL warehouses are not serverless when created with the API; you must explicitly specify serverless. Set `warehouse_type` to `pro` and set `enable_serverless_compute` to `true`. If omitted, `enable_serverless_compute` defaults to `false` for most workspaces. However, if your workspace used the SQL Warehouses API to create a warehouse between September 1, 2022 and April 30, 2023, the field instead defaults to `true` if the workspace is enabled for serverless and meets the requirements for serverless SQL warehouses. To avoid ambiguity, especially for organizations with many workspaces, Databricks recommends that you always set this field explicitly.

Create a new serverless SQL warehouse using Terraform. By default, new SQL warehouses are not serverless when created with Terraform; you must explicitly specify serverless. If omitted, the default is `false` for most workspaces, with the same September 1, 2022 to April 30, 2023 exception described above. Databricks recommends that you always set this field explicitly.

Upgrade a pro or classic SQL warehouse to a serverless SQL warehouse.
Run a query with your new serverless SQL warehouse.
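As a sketch, a test query can be submitted with the Databricks SQL Statement Execution API (`POST /api/2.0/sql/statements/`). The warehouse ID below is a placeholder, and sending the request requires workspace credentials:

```python
# Sketch: build a request to run a test query on a SQL warehouse via the
# Statement Execution API. The warehouse ID is a placeholder.
import json


def statement_payload(warehouse_id: str, sql: str = "SELECT 1") -> dict:
    # wait_timeout makes the call block briefly for the result.
    return {"warehouse_id": warehouse_id, "statement": sql, "wait_timeout": "30s"}


payload = statement_payload("abc123def456")
print(json.dumps(payload, indent=2))

# With credentials, POST this to {host}/api/2.0/sql/statements/:
# import os, requests
# resp = requests.post(
#     f"{os.environ['DATABRICKS_HOST']}/api/2.0/sql/statements/",
#     headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
#     json=payload,
# )
# resp.raise_for_status()
```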
Troubleshooting
If your trust relationship is misconfigured, clusters fail with a message that says “Request to create a cluster failed with an exception INVALID_PARAMETER_VALUE: IAM role <role-id> does not have the required trust relationship.”
If you get this error, it could be that the workspace IDs were incorrect or that the trust policy was not updated correctly on the correct role.
Carefully perform the steps in Step 2: Confirm or set up an AWS instance profile to use with your serverless SQL warehouses to update the trust relationship.
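To narrow down the cause, a check like the following can be run against the role's trust policy document (fetched, for example, with the AWS CLI's `aws iam get-role`). The policy document below is a minimal hypothetical example:

```python
# Sketch: verify that a role's trust policy contains the Databricks
# serverless statement with the expected workspace ID.
SERVERLESS_PRINCIPAL = (
    "arn:aws:iam::790110701330:role/serverless-customer-resource-role"
)


def has_serverless_trust(policy: dict, workspace_id: str) -> bool:
    expected_id = f"databricks-serverless-{workspace_id}"
    for stmt in policy.get("Statement", []):
        principals = stmt.get("Principal", {}).get("AWS", [])
        if isinstance(principals, str):
            principals = [principals]
        external_ids = (
            stmt.get("Condition", {}).get("StringEquals", {}).get("sts:ExternalId", [])
        )
        if isinstance(external_ids, str):
            external_ids = [external_ids]
        if SERVERLESS_PRINCIPAL in principals and expected_id in external_ids:
            return True
    return False


# Hypothetical trust policy document:
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": [SERVERLESS_PRINCIPAL]},
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "sts:ExternalId": ["databricks-serverless-6280049833385130"]
                }
            },
        }
    ],
}
print(has_serverless_trust(policy, "6280049833385130"))  # True
```

If the check returns `False` for the workspace ID that appears in your workspace URL, the statement is missing or attached to the wrong role.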
Configuring Glue metastore for serverless SQL warehouses
If you need to specify an AWS Glue metastore or add additional data source configurations, update the Data Access Configuration field in the admin console. See Data access configuration.
Important
Serverless SQL warehouses support the default Databricks metastore and AWS Glue as a metastore, but do not support external Hive metastores.