Enable serverless SQL warehouses

Preview

Serverless SQL warehouses are available in Public Preview.

With the serverless compute version of the Databricks platform architecture, the compute layer exists in the Databricks cloud subscription rather than the customer’s cloud subscription. Serverless compute is supported for use with Databricks SQL. Admins can create serverless SQL warehouses, which provide instant compute and are managed by Databricks. Serverless SQL warehouses use compute clusters in the Databricks AWS account. Use serverless SQL warehouses with Databricks SQL queries just as you would use the original customer-hosted SQL warehouses, which are now called classic SQL warehouses.

If serverless SQL warehouses are enabled for your workspace:

• This feature only affects Databricks SQL. It does not affect how Databricks Runtime clusters work with notebooks and jobs in the Data Science & Engineering or Databricks Machine Learning workspace environments.

• Serverless SQL warehouses do not have public IP addresses. For more architectural information, see Serverless compute.

Before you can create serverless SQL warehouses, your organization must perform several main tasks:

| Task | Who can do this step? | Where is this done? |
| --- | --- | --- |
| Enable use of Serverless compute for your account. | Account owner or account administrator. | The account console’s settings page. |
| Enable one or more workspaces for Serverless SQL warehouses. | Workspace administrator. | The SQL admin console’s settings page. |
| If your workspace uses an instance profile for Databricks SQL, you may need to update its role to add a trust relationship. | Workspace administrator (to confirm which instance profile your workspace uses for Databricks SQL) and an AWS administrator with permissions to view and make changes to AWS IAM policies (to check the role’s trust relationship policy or make any necessary changes). | The workspace’s Databricks SQL settings page and the AWS console. |

This article describes how to perform these steps. If you do not hold all of these roles (for example, you are a workspace admin but not an account admin, or you do not have access to your AWS IAM roles), you may need to ask others in your organization to perform some steps.

Databricks changed the name from SQL endpoint to SQL warehouse because, in the industry, endpoint refers to either a remote computing device that communicates with a network that it’s connected to, or an entry point to a cloud service. A data warehouse is a data management system that stores current and historical data from multiple sources in a business friendly manner for easier insights and reporting. SQL warehouse accurately describes the full capabilities of this compute resource.

Requirements

Note that the Databricks documentation on cluster size instance types and CPU quotas applies only to pro and classic SQL warehouses, not to serverless SQL warehouses.

Step 1: Enable use of Serverless compute for your account

Before you can enable Serverless SQL warehouses at the workspace level, your organization’s account owner or an account administrator must enable Serverless compute. This is a one-time step.

Note

If you are not an account owner or account administrator, you cannot perform this step. Contact the account owner or an account administrator before continuing to the next steps in this article.

  1. As an account owner or account administrator, go to the feature enablement tab of the account console’s settings page.

  2. Next to Enable use of Serverless compute, click the blue Enable button.

    Feature enablement tab in account console

    If the blue button does not appear but there is text that says Enabled, this step is already complete. Continue to Step 2: Enable serverless SQL warehouses for a workspace.

    Already enabled
  3. A pop-up appears about agreeing to applicable terms of use. Click the link to open the applicable terms in a new browser tab. When complete, return to the original tab and click the Enable button in the pop-up.

    Feature enablement tab in account console

Step 2: Enable serverless SQL warehouses for a workspace

  1. As a Databricks workspace administrator, go to the SQL admin console in Databricks SQL.

    Note

    If you are not a workspace administrator, you cannot perform this step. Contact the workspace administrator to request that they enable Serverless SQL warehouses.

    If you are in the Data Science & Engineering or Databricks Machine Learning workspace environment, you may need to select SQL from the sidebar. Click the icon below the Databricks logo.

    SQL persona

    Once you are in Databricks SQL, click Settings (the user settings icon) at the bottom of the sidebar and select SQL Admin Console.

    SQL Admin Console

    If you do not see the SQL Admin Console menu item, your user account is not an admin for this workspace.

  2. In the SQL admin console, click the SQL Warehouse settings tab.

    SQL warehouse settings tab
  3. Select Serverless SQL Warehouses.

    Enable Serverless Compute

    If you do not see the Serverless SQL Warehouses option:

    • It is likely that the terms of use for your account have not been accepted by your account owner or an account administrator. See Step 1: Enable use of Serverless compute for your account.

    • Your account might have restrictions that prevent enabling this feature. For example, the account is not on the E2 version of the platform, the account is still on a free trial, or the workspace uses the compliance security profile. See Requirements. If you have questions, contact your Databricks representative.

  4. Scroll down to the bottom of the page and click Save Changes.

    Serverless Save Changes

    Important

    Be careful to click Save Changes before navigating to another page or the change won’t take effect.

Step 3: Confirm or set up an AWS instance profile to use with your serverless SQL warehouses

An instance profile is a container for an IAM role that you can use to pass role information to an EC2 instance when the instance starts. You can optionally configure an AWS instance profile for Databricks SQL to connect to AWS S3 buckets other than your root bucket.

If you already use an instance profile with Databricks SQL, the role associated with the instance profile needs a Databricks Serverless compute trust relationship statement so that Serverless SQL warehouses can use it.

Depending on how and when your instance profile was created, the role might already have the trust relationship, in which case you do not need to modify it. If the instance profile was created during Databricks workspace creation with AWS Quickstart after June 24, 2022, its role already has this change. Similarly, if someone in your organization created the instance profile manually after June 24, 2022 by following the Databricks article on creating an instance profile, the role likely has this trust relationship statement already.

This section describes how to confirm that the role associated with the instance profile has the trust relationship statement, and how to add the statement if it is missing. The statement enables your Serverless SQL warehouses to use the role to access your S3 buckets.

Important

To perform these steps, you must be both a Databricks workspace admin (to confirm which instance profile your workspace uses for Databricks SQL) and an AWS account administrator (to check the role’s trust relationship policy or make any necessary changes).

  1. If you are not already viewing the SQL admin console settings page because you followed the steps in the previous section, navigate to it now.

    1. As a Databricks workspace administrator, go to the SQL admin console in Databricks SQL. If you are in the Data Science & Engineering or Databricks Machine Learning workspace environment, you may need to select SQL from the sidebar. Click the icon below the Databricks logo.

    2. Once you are in Databricks SQL, click Settings (the user settings icon) at the bottom of the sidebar and select SQL Admin Console.

      SQL Admin Console
    3. In the SQL admin console, click the SQL Warehouse Settings tab.

      SQL warehouse settings tab
  2. Look in the Data Security section for the Instance Profile field. Confirm whether your workspace is configured to use an AWS instance profile for Databricks SQL to connect to AWS S3 buckets other than your root bucket.

    • If you are using an instance profile, its name is visible in the Instance Profile field. Make a note of it for the next step.

      Instance profile picker
    • If the field value is None, you are not using an instance profile to access S3 buckets other than your workspace’s root bucket. Setup is complete. Skip to Step 4: Test usage of serverless SQL warehouses.

      Instance profile picker none
  3. Confirm whether your instance profile name matches the associated role name.

    1. In the AWS console, go to the IAM service’s Roles tab. It lists all the IAM roles in your account.

    2. Click the role whose name matches the instance profile name that you noted from the Instance Profile field in the Data Security section of the Databricks SQL admin settings.

    3. In the summary area, find the Role ARN and Instance Profile ARNs fields.

    4. Check whether the text after the final slash is the same in both fields. For example:

    Does instance profile name and role arn name match
  4. If you determined in the previous step that the role name (the text after the last slash in the role ARN) and the instance profile name (the text after the last slash in the instance profile ARN) do not match, edit your instance profile registration to specify your IAM role ARN.

    1. To edit your instance profiles, look below the Instance profile field and click the Configure button.

    2. Click your instance profile’s name.

    3. Click Edit.

      Edit instance profile Role ARN
    4. In the optional Role ARN field, paste the role ARN for the role associated with your instance profile. This is the key step that allows your instance profile to work with Databricks SQL Serverless even if the role name does not match the instance profile name.

    5. Click Save.

  5. Within the AWS console, confirm or edit the trust relationship.

    1. In the AWS console IAM service’s Roles tab, click the role that you want to modify.

    2. Click the Trust relationships tab.

      Trust relationships tab
    3. Click Edit trust policy.

      Edit trust policy button
    4. Edit the trust policy JSON. Within the Statement array, add the following statement (a JSON block) to your role’s trust policy.

      Use the following text, but replace the placeholder entries in the sts:ExternalId array with one entry for each Databricks workspace that will use this role. Each entry has the form databricks-serverless-<workspace ID>.

      To get your workspace ID while you are using your workspace, check the URL. For example, in https://<databricks-instance>/?o=6280049833385130, the number after o= is the Databricks workspace ID. In that case, the workspace ID is 6280049833385130. To complete this step, you must find and copy the workspace ID number for each of your workspaces.

      Workspace ID

      Important

      Do not change the principal of the policy. Use the exact value that is in the following policy statement. It is critical that the Principal.AWS field continue to have the value arn:aws:iam::790110701330:role/serverless-customer-resource-role in your trust statement. It references a Serverless compute role that is managed by Databricks. Do not change this value. The only thing you need to change in the new statement is the list of workspace IDs.

      {
        "Effect": "Allow",
        "Principal": {
          "AWS": [
            "arn:aws:iam::790110701330:role/serverless-customer-resource-role"
          ]
        },
        "Action": "sts:AssumeRole",
        "Condition": {
          "StringEquals": {
            "sts:ExternalId": [
              "databricks-serverless-<YOUR_WORKSPACE_ID1>",
              "databricks-serverless-<YOUR_WORKSPACE_ID2>"
            ]
          }
        }
      }
      

      For example:

      Trust relationship policy
    5. Click Update Trust Policy.
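
The trust statement from the steps above can also be generated rather than edited by hand, which may reduce typing mistakes in the workspace IDs. The following Python sketch extracts the workspace ID from the `o=` query parameter of a workspace URL and builds the statement with one `databricks-serverless-<workspace ID>` entry per workspace. The helper names `workspace_id_from_url` and `build_trust_statement` and the example hostname are hypothetical and not part of any Databricks or AWS tool. The principal ARN is copied unchanged from the policy statement in this article.

```python
import json
from urllib.parse import urlparse, parse_qs

# Principal copied verbatim from the policy statement in this article. Do not change it.
SERVERLESS_PRINCIPAL = "arn:aws:iam::790110701330:role/serverless-customer-resource-role"


def workspace_id_from_url(url: str) -> str:
    """Return the number after o= in a workspace URL (hypothetical helper)."""
    values = parse_qs(urlparse(url).query).get("o", [])
    if not values or not values[0].isdigit():
        raise ValueError(f"no workspace ID found in {url!r}")
    return values[0]


def build_trust_statement(workspace_ids: list[str]) -> dict:
    """Build the trust statement for the given workspace IDs (hypothetical helper)."""
    if not workspace_ids:
        raise ValueError("at least one workspace ID is required")
    return {
        "Effect": "Allow",
        "Principal": {"AWS": [SERVERLESS_PRINCIPAL]},
        "Action": "sts:AssumeRole",
        "Condition": {
            "StringEquals": {
                # One entry per workspace that will use this role.
                "sts:ExternalId": [f"databricks-serverless-{w}" for w in workspace_ids]
            }
        },
    }


if __name__ == "__main__":
    # The hostname is a placeholder; only the o= parameter matters here.
    wid = workspace_id_from_url("https://example.cloud.databricks.com/?o=6280049833385130")
    print(json.dumps(build_trust_statement([wid]), indent=2))
```

The printed JSON object is what you would add to the Statement array of the role’s trust policy in the AWS console. The sketch only builds the text; it does not call AWS or modify any role.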

Important

If your instance profile changes at a later time, repeat these steps to ensure that the trust relationship for the instance profile’s role contains the required extra statement.
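
When you repeat these steps for a changed instance profile, the name comparison (the text after the final slash of the role ARN versus the instance profile ARN) is easy to get wrong by eye. A minimal Python sketch of that comparison follows. The function names and the example ARNs, including the account ID 123456789012, are illustrative assumptions and do not come from your account.

```python
def arn_resource_name(arn: str) -> str:
    """Return the text after the final slash of an IAM ARN."""
    if "/" not in arn:
        raise ValueError(f"ARN has no slash-separated resource name: {arn!r}")
    return arn.rsplit("/", 1)[1]


def names_match(role_arn: str, instance_profile_arn: str) -> bool:
    """True when the role name and the instance profile name are identical."""
    return arn_resource_name(role_arn) == arn_resource_name(instance_profile_arn)


if __name__ == "__main__":
    # Placeholder ARNs for illustration only.
    role = "arn:aws:iam::123456789012:role/databricks-sql-access"
    profile = "arn:aws:iam::123456789012:instance-profile/databricks-sql-profile"
    if names_match(role, profile):
        print("Names match; no Role ARN needs to be set on the instance profile.")
    else:
        print("Names differ; set the Role ARN field on the instance profile registration.")
```

If the names differ, the remedy described earlier applies: edit the instance profile registration and paste the role ARN into the optional Role ARN field.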

Step 4: Test usage of serverless SQL warehouses

  1. Create a serverless SQL warehouse, or convert an existing SQL warehouse to serverless.

  2. Run a query with your new serverless SQL warehouse.

Troubleshooting

If your trust relationship is misconfigured, clusters fail with a message that says “Request to create a cluster failed with an exception INVALID_PARAMETER_VALUE: IAM role <role-id> does not have the required trust relationship.”

trust relationship failed

If you get this error, the workspace IDs might be incorrect, or the trust policy might not have been updated correctly on the correct role.

Carefully perform the steps in Step 3: Confirm or set up an AWS instance profile to use with your serverless SQL warehouses to update the trust relationship.
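
To narrow down which of these causes applies, you can copy the role’s trust policy JSON from the AWS console and check it offline. The following Python sketch reports which workspace IDs are not covered by an Allow statement for the Databricks Serverless compute principal. It is a simplified check under stated assumptions: the function name `missing_workspace_ids` is hypothetical, it only looks at `StringEquals` conditions on `sts:ExternalId`, and it does not evaluate full IAM policy semantics.

```python
import json

# Principal copied verbatim from the trust statement in this article.
SERVERLESS_PRINCIPAL = "arn:aws:iam::790110701330:role/serverless-customer-resource-role"
EXTERNAL_ID_PREFIX = "databricks-serverless-"


def _as_list(value):
    """IAM JSON allows a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def missing_workspace_ids(trust_policy: dict, workspace_ids: list[str]) -> list[str]:
    """Return the workspace IDs that the trust policy does not cover (hypothetical helper)."""
    allowed = set()
    for stmt in _as_list(trust_policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        if "sts:AssumeRole" not in _as_list(stmt.get("Action")):
            continue
        principal = stmt.get("Principal")
        principals = _as_list(principal.get("AWS")) if isinstance(principal, dict) else []
        if SERVERLESS_PRINCIPAL not in principals:
            continue
        condition = stmt.get("Condition", {})
        allowed.update(_as_list(condition.get("StringEquals", {}).get("sts:ExternalId")))
    return [w for w in workspace_ids if EXTERNAL_ID_PREFIX + w not in allowed]


if __name__ == "__main__":
    # Paste the trust policy copied from the AWS console in place of this sample.
    policy = json.loads("""
    {"Version": "2012-10-17", "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": ["arn:aws:iam::790110701330:role/serverless-customer-resource-role"]},
        "Action": "sts:AssumeRole",
        "Condition": {"StringEquals": {"sts:ExternalId": ["databricks-serverless-6280049833385130"]}}
    }]}
    """)
    print(missing_workspace_ids(policy, ["6280049833385130"]))
```

An empty result suggests that the statement is present for every listed workspace, in which case check that you edited the role that the instance profile actually uses. A non-empty result lists the workspace IDs to add to the `sts:ExternalId` array.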

Configuring Glue metastore for serverless SQL warehouses

If you need to specify an AWS Glue metastore or add additional data source configurations, update the Data Access Configuration field in the SQL Admin console. See Data access configuration.

Important

Serverless SQL warehouses support the default Databricks metastore and AWS Glue as a metastore, but do not support external Hive metastores.