SQL endpoints

A SQL endpoint is a compute resource that lets you run SQL commands on data objects within Databricks SQL. This article introduces SQL endpoints and describes how to work with them using the Databricks SQL UI. (SQL endpoints are one type of Databricks compute resource; Databricks clusters are another.) To work with SQL endpoints using the API, see SQL Endpoints APIs.

Before you begin

When you create your first SQL endpoint, Databricks recommends that you accept the defaults as they appear on the New SQL Endpoint page. You can, however, configure many options to meet your specific needs. Among those options, you should be aware of:

  • Serverless SQL endpoints (Public Preview), which use compute resources that are managed by Databricks, unlike a Classic SQL endpoint, whose compute resources are in your AWS account. Serverless SQL endpoints simplify SQL endpoint management and accelerate launch times. See Enable Serverless SQL endpoints.
  • Channels, which let you choose whether to use the current SQL endpoint compute version or the preview version. Preview versions let you try out functionality before it becomes the Databricks SQL standard. Take advantage of preview versions to test your queries and dashboards against upcoming changes. Typically, preview versions are promoted to the current version two weeks after initial preview release, but some previews may last longer. You can learn about the features in the latest preview version by reviewing the release notes. Databricks does not recommend using preview versions for production workloads.

Requirements

  • To create a SQL endpoint, you must have cluster creation permission in Databricks Data Science & Engineering.
  • To manage a SQL endpoint, you must have Can Manage permission in Databricks SQL.

View SQL endpoints

Click Endpoints Icon SQL Endpoints in the sidebar.

By default, endpoints are sorted by state (running endpoints first), then in alphabetical order. You can reorder the list by clicking the column headings.

To help you get started quickly, such as when you follow the Databricks SQL quickstart, Databricks creates a small SQL endpoint called Starter Endpoint automatically. You can edit or delete this SQL endpoint.

To filter the list of endpoints, enter text in the search box:

Filter endpoints

To see more about an endpoint’s configuration, including connection details, monitoring, and advanced settings, click the name of the endpoint.

Endpoint details

Create a SQL endpoint

You can create a SQL endpoint using the New SQL Endpoint page in the web UI or using the SQL Endpoint API.

By accepting all the defaults on the New SQL Endpoint page, you can create an efficient and high-performing SQL endpoint quickly and easily. You can override those defaults if your workload or environment requires it.

To create a SQL endpoint using the web UI:

  1. Click Endpoints Icon-1 SQL Endpoints in the sidebar.

  2. Click Create SQL Endpoint to open the New SQL Endpoint dialog.

    Create endpoint
  3. Enter a name for the endpoint.

  4. Accept the default endpoint properties or edit them.

    • Cluster Size represents the number of cluster workers and size of compute resources available to run your queries and dashboards. The default is X-Large. To reduce query latency, increase the size. For details, see Cluster size.

    • Auto Stop determines whether the endpoint stops if it’s idle for the specified number of minutes. The default is 10 minutes. If you disable Serverless under Advanced options, 120 minutes is recommended.

      Idle SQL endpoints continue to accumulate DBU and cloud instance charges until they are stopped.

    • Scaling sets the minimum and maximum number of clusters over which queries sent to the endpoint are distributed.

      The default is a minimum of one and maximum of one cluster.

      To handle more concurrent users, increase the cluster count. Databricks recommends one cluster for every ten concurrent queries. To learn how Databricks adds clusters to and removes clusters from an endpoint, see Queueing and autoscaling.

  5. (Optional) Configure advanced options.

    If you want to do any of the following, expand Advanced options:

    • Add tags to help monitor SQL endpoint usage
    • Turn off the Photon performance engine
    • Turn off the Serverless option
    • Use the Preview channel to test upcoming features

    For details, see Advanced options.

    Note

    If Serverless is not enabled for your workspace, you do not see that option.

  6. Click Create.

    The endpoint is created and started.
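The same defaults can be supplied programmatically. The following Python sketch builds a create request matching the defaults described above; the `/api/2.0/sql/endpoints` path and the field names are assumptions based on the SQL Endpoints APIs, so verify them against that reference before use:

```python
import json

# Hypothetical create payload mirroring the New SQL Endpoint defaults
# described above. Field names are assumptions; confirm them against
# the SQL Endpoints APIs reference.
payload = {
    "name": "My Endpoint",      # endpoint name (step 3)
    "cluster_size": "X-Large",  # default Cluster Size (step 4)
    "auto_stop_mins": 10,       # default Auto Stop
    "min_num_clusters": 1,      # default Scaling minimum
    "max_num_clusters": 1,      # default Scaling maximum
}

# To actually create the endpoint, POST the payload with a personal
# access token (uncomment and fill in the placeholders):
# import requests
# requests.post(
#     "https://<workspace-url>/api/2.0/sql/endpoints",
#     headers={"Authorization": "Bearer <personal-access-token>"},
#     data=json.dumps(payload),
# )
print(json.dumps(payload, indent=2))
```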

Advanced options

You can configure the following advanced options by expanding the Advanced options area when you create a SQL endpoint using the New SQL endpoint dialog or edit an existing SQL endpoint using the edit dialog. You can also configure these options using the SQL Endpoints APIs.

Configure advanced options

Add tags for usage monitoring

Tags allow you to easily monitor the cost of cloud resources used by users and groups in your organization. When you create or edit a SQL endpoint, expand the Advanced options area to specify tags as key-value pairs. Databricks applies these tags to cloud resources.
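In API terms, tags are a list of key-value pairs. A minimal sketch of the request fragment; the `custom_tags` structure is an assumption, so confirm it against the SQL Endpoints APIs reference:

```python
# Hypothetical tags fragment for a create or edit request. The
# "custom_tags" structure is an assumption; the team/cost-center
# keys are made-up examples.
tags = {
    "custom_tags": [
        {"key": "team", "value": "data-eng"},
        {"key": "cost-center", "value": "1234"},
    ]
}

# These tags are applied to the cloud resources Databricks starts
# for the endpoint, so they surface in your cloud billing reports.
tag_keys = [t["key"] for t in tags["custom_tags"]]
print(tag_keys)
```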

Manage performance using the Photon option

Keep Photon on to ensure that queries are executed on the Photon native vectorized engine that speeds up query execution. Databricks recommends against disabling Photon and plans to remove the off option in a future release. However, there may be circumstances in which a Databricks support representative recommends that you disable it. In that case, expand Advanced options to turn Photon off.

Switch the SQL endpoint type (Classic or Serverless)

Classic SQL endpoints use compute resources in your AWS account. Serverless SQL endpoints (Public Preview) use compute resources in the Databricks cloud account. Serverless SQL endpoints simplify SQL endpoint configuration and usage and accelerate launch times. The Serverless option is available only if it has been enabled for the workspace. For more information, see Enable Serverless SQL endpoints and Serverless compute.

  • If Serverless SQL endpoints are enabled for your workspace, they are the default whenever you create a SQL endpoint. If you prefer a Classic SQL endpoint, expand Advanced options and turn Serverless off. To switch back to Serverless, turn the toggle on.
  • If Serverless SQL endpoints are not enabled, the SQL endpoint is always a Classic SQL endpoint. The Serverless toggle does not appear, and you cannot change the endpoint type.

Note

If you turn off Serverless, the Spot Instance Policy option appears, with the default value Cost Optimized.

Configure spot instance policy (Classic SQL endpoints only)

If Serverless SQL endpoints are not enabled for your workspace, or if you turn off the Serverless option in Advanced options, the Spot Instance Policy option appears, with the default value Cost Optimized.

The spot instance policy determines whether workers use only on-demand instances or a combination of on-demand and spot instances. Cost Optimized uses mostly spot instances and one on-demand instance. Reliability Optimized uses only on-demand instances.
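In API terms, the policy is a single field. A sketch of the request fragment; the `spot_instance_policy` field name and its values are assumptions, so check the SQL Endpoints APIs reference:

```python
# Hypothetical request fragment for a Classic SQL endpoint. The
# "spot_instance_policy" field and its values are assumptions.
SPOT_POLICIES = {
    "COST_OPTIMIZED": "mostly spot instances plus one on-demand instance",
    "RELIABILITY_OPTIMIZED": "on-demand instances only",
}
fragment = {"spot_instance_policy": "COST_OPTIMIZED"}  # the default
print(fragment, "->", SPOT_POLICIES[fragment["spot_instance_policy"]])
```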

Use the preview channel

Channels let you choose whether to use the Current SQL endpoint compute version or the Preview version. A preview version lets you try out functionality before it becomes the Databricks SQL standard. Take advantage of the preview channel to test your queries and dashboards against upcoming changes. You can learn about what’s in the latest preview version in the release notes.

Current is the default. To switch a SQL endpoint to the preview channel, expand Advanced options and select Preview.

Important

Databricks recommends against using a preview version for production workloads. Because only administrators can view an endpoint’s properties, including its channel, consider indicating in a SQL endpoint’s name that it uses a preview version so that users do not inadvertently use it for production workloads.

Start, stop, or delete a SQL endpoint

  1. Click Endpoints Icon-2 SQL Endpoints in the sidebar.
  2. To stop a running endpoint, click Stop.
  3. To start a stopped endpoint, click Start.
  4. To delete an endpoint, click the vertical ellipsis Vertical Ellipsis in the Actions column, then click Delete.

See also Configure an instance profile.

Edit a SQL endpoint

You can edit a SQL endpoint using the web UI or the SQL Endpoint API.

To edit a SQL endpoint using the web UI:

  1. Click Endpoints Icon-3 SQL Endpoints in the sidebar.

  2. In the Actions column, click the vertical ellipsis Vertical Ellipsis and click Edit.

  3. Edit the endpoint properties.

    For information about each editable property, see Create a SQL endpoint and Advanced options.

  4. Click Save or Save and restart.
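The same edit can be made through the SQL Endpoints APIs. A hedged Python sketch; the `/api/2.0/sql/endpoints/<endpoint-id>/edit` path and field name are assumptions, so confirm them against the SQL Endpoints APIs reference:

```python
import json

# Example change: lengthen the Auto Stop window to 120 minutes.
# The "auto_stop_mins" field name is an assumption.
changes = {"auto_stop_mins": 120}

# Hypothetical edit call (uncomment and fill in the placeholders;
# verify the path against the SQL Endpoints APIs reference first):
# import requests
# requests.post(
#     "https://<workspace-url>/api/2.0/sql/endpoints/<endpoint-id>/edit",
#     headers={"Authorization": "Bearer <personal-access-token>"},
#     data=json.dumps(changes),
# )
print(json.dumps(changes))
```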

Convert a Classic SQL endpoint to a Serverless SQL endpoint

To learn about Classic and Serverless SQL endpoints, see Enable Serverless SQL endpoints.

Preview

This feature is in Public Preview. To sign up for access, see Request Access to Serverless SQL.

  1. Click Endpoints Icon SQL Endpoints in the sidebar.
  2. Click a Classic SQL endpoint.
  3. Click the Edit button.
  4. Click the Serverless toggle to change its type.
  5. Click Save or Save and restart.

Configure SQL endpoint permissions

To configure permissions for a SQL endpoint:

  1. Click Endpoints Icon-4 SQL Endpoints in the sidebar.

  2. Click an endpoint.

  3. Click the Permissions Button button.

    The SQL endpoint Permissions dialog appears. The endpoint creator and Databricks admins have Can Manage permission by default.

    Add permission
  4. Select a user or group and a permission.

  5. Click Add.

  6. Click Save.

To learn about permission levels, see SQL endpoint access control.

Monitor a SQL endpoint

You can view the number of queries handled by the endpoint and the number of clusters allocated to the endpoint.

  1. Click Endpoints Icon-5 SQL Endpoints in the sidebar.

  2. Click an endpoint.

  3. Click Monitoring.

    A chart appears showing the number of queries handled by the endpoint and the number of clusters allocated to the endpoint over the last 6 hours.

    Click a timescale button at the top right of the chart to change the displayed period. For example, the following screenshot shows these statistics over 7 days:

    Monitor endpoint

    Note

    The Cluster Count can be greater than one only if scaling is enabled and configured.

Cluster size

The table in this section maps SQL endpoint cluster sizes to Databricks cluster driver size and worker counts.

Cluster size   Driver size   Worker count
2X-Small       i3.2xlarge    1
X-Small        i3.2xlarge    2
Small          i3.4xlarge    4
Medium         i3.8xlarge    8
Large          i3.8xlarge    16
X-Large        i3.16xlarge   32
2X-Large       i3.16xlarge   64
3X-Large       i3.16xlarge   128
4X-Large       i3.16xlarge   256

The instance size of all workers is i3.2xlarge.
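One way to read the table: each step up in cluster size doubles the worker count, while every worker uses the same instance size. A quick sketch encoding the table's worker counts:

```python
# Worker counts copied from the table above; each cluster size
# doubles the worker count of the size before it.
WORKERS = {
    "2X-Small": 1, "X-Small": 2, "Small": 4, "Medium": 8,
    "Large": 16, "X-Large": 32, "2X-Large": 64, "3X-Large": 128,
    "4X-Large": 256,
}
sizes = list(WORKERS)
doubles = all(
    WORKERS[sizes[i + 1]] == 2 * WORKERS[sizes[i]]
    for i in range(len(sizes) - 1)
)
print(doubles, WORKERS["X-Large"])
```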

Queueing and autoscaling

Databricks limits the number of queries on a cluster assigned to a SQL endpoint to 10. Upscaling of clusters per endpoint is based on query throughput, the rate of incoming queries, and the queue size. Databricks adds clusters based on the time it would take to process all currently running queries, all queued queries, and the incoming queries expected in the next two minutes as follows:

  • Less than 2 minutes, don’t upscale.
  • 2 to 6 minutes, add 1 cluster.
  • 6 to 12 minutes, add 2 clusters.
  • 12 to 22 minutes, add 3 clusters.

If the expected time is more than 22 minutes, Databricks adds 3 clusters plus 1 cluster for every additional 15 minutes of expected query load.

In addition, an endpoint is always upscaled if a query waits for 5 minutes in the queue.

If the load is low for 15 minutes, Databricks downscales the SQL endpoint. It keeps enough clusters to handle the peak load over the last 15 minutes. For example, if the peak load was 25 concurrent queries, Databricks keeps 3 clusters.
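The scaling rules above can be sketched as follows. This is an illustrative model, not Databricks' actual implementation; in particular, rounding up the "additional 15 minutes" increments and the exact threshold boundaries are assumptions:

```python
import math

def clusters_to_add(expected_minutes):
    """Illustrative sketch of the upscaling thresholds above
    (not Databricks' actual implementation)."""
    if expected_minutes < 2:
        return 0
    if expected_minutes < 6:
        return 1
    if expected_minutes < 12:
        return 2
    if expected_minutes < 22:
        return 3
    # beyond 22 minutes: 3 clusters plus 1 per additional 15 minutes
    return 3 + math.ceil((expected_minutes - 22) / 15)

def clusters_to_keep(peak_concurrent_queries):
    """Downscaling keeps enough clusters for the recent peak,
    at 10 queries per cluster (e.g. a peak of 25 -> 3 clusters)."""
    return max(1, math.ceil(peak_concurrent_queries / 10))

print(clusters_to_add(10), clusters_to_keep(25))  # 2 3
```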

Query queuing

Databricks queues queries when all clusters assigned to the endpoint are currently executing 10 queries or when the endpoint is in the STARTING state.

Metadata queries (for example, DESCRIBE <table>) and state-modifying queries (for example, SET) are never queued, unless the endpoint is in the STARTING state.
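The queuing decision can be sketched as a small predicate. The names and states below are illustrative assumptions, not Databricks internals:

```python
def should_queue(cluster_loads, endpoint_state, is_metadata=False):
    """Sketch of the queuing rules above (illustrative only)."""
    if endpoint_state == "STARTING":
        return True   # everything queues while the endpoint starts
    if is_metadata:
        return False  # metadata/state-modifying queries bypass the queue
    # otherwise, queue only when every cluster is at its 10-query limit
    return all(load >= 10 for load in cluster_loads)

print(should_queue([10, 10], "RUNNING"))  # True
print(should_queue([10, 3], "RUNNING"))   # False
```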

Query routing

Databricks routes queries as follows:

  • New session: to the cluster with the least load.
  • Existing session: to the cluster that ran the previous query for that session. If that cluster does not have available capacity, the query is routed to the cluster with the least load.
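The routing rules above can be sketched as follows; the names are assumptions for illustration, not Databricks internals:

```python
MAX_QUERIES_PER_CLUSTER = 10  # per-cluster query limit described above

def route(session_to_cluster, loads, session_id):
    """Pick a cluster index for a query in the given session
    (illustrative sketch of the routing rules above)."""
    prev = session_to_cluster.get(session_id)
    if prev is not None and loads[prev] < MAX_QUERIES_PER_CLUSTER:
        chosen = prev  # existing session with capacity: reuse its cluster
    else:
        # new session, or previous cluster full: least-loaded cluster
        chosen = min(range(len(loads)), key=loads.__getitem__)
    session_to_cluster[session_id] = chosen
    loads[chosen] += 1
    return chosen

sessions, loads = {}, [10, 3]          # cluster 0 is at its limit
first = route(sessions, loads, "s1")   # new session -> cluster 1
second = route(sessions, loads, "s1")  # same session -> same cluster
print(first, second)  # 1 1
```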

Query Watchdog and SQL endpoints

Queries made through SQL endpoints are managed by Query Watchdog, a process that prevents queries from monopolizing compute resources by examining the most common causes of large queries and terminating queries that pass a threshold. For an explanation of the defaults used by Query Watchdog, see Handling large queries in interactive workflows.

Enable Serverless SQL endpoints

Preview

This feature is in Public Preview. To sign up for access, see Request Access to Serverless SQL.

Classic SQL endpoints use compute resources in your AWS account. Serverless SQL endpoints (Public Preview) use compute resources in the Databricks cloud account. Serverless SQL endpoints simplify SQL endpoint configuration and usage and accelerate launch times. You can use them to run Databricks SQL queries just like you do with a Classic SQL endpoint.

Databricks supports Serverless SQL endpoints in AWS regions eu-west-1, ap-southeast-2, us-east-1, and us-west-2. See Supported Databricks regions.

You can create new Serverless SQL endpoints using the SQL endpoints UI or the SQL Endpoints API. You can convert a Classic SQL endpoint to a Serverless SQL endpoint. You can have Serverless SQL endpoints and Classic SQL endpoints running at the same time.

For a more thorough comparison of Serverless and Classic SQL endpoints, see Serverless compute.

Before you can create Serverless SQL endpoints, a Databricks workspace administrator must enable Serverless SQL Endpoints for your workspace. This section describes how.

Enable Serverless SQL endpoints for a workspace

  1. Request access to the preview on the Request Access to Serverless SQL page.

    You will be asked for your workspace ID. To learn how to get your workspace ID, see Workspace instance names, URLs, and IDs.

    Important

      Do not proceed with SQL endpoint enablement until you receive your confirmation email. Confirmation may take one to three business days.

  2. Enable Serverless compute for your workspace.

    1. As a Databricks workspace administrator, go to the SQL admin console in Databricks SQL.

      If you are in the Data Science & Engineering or Databricks Machine Learning workspace environment, you may need to select SQL from the sidebar. Click the icon below the Databricks logo.

      Once you are in Databricks SQL, click User Settings Icon Settings at the bottom of the sidebar and select SQL Admin Console.

      SQL Admin Console

      If you do not see the SQL Admin Console menu item, your user account is not an admin for this workspace.

    2. In the SQL admin console, click the SQL Endpoint Settings tab.

    3. Select Serverless SQL Endpoints.

      Enable Serverless Compute
    4. If this is the first time this feature is being enabled on this workspace, read and accept the Service Specific Terms that appear.

  3. If you use an AWS instance profile to connect to AWS data sources other than your workspace’s root S3 bucket, confirm that your AWS instance profile and its associated IAM role are set up correctly. See (Optional) Set up an AWS instance profile to use with your Serverless SQL endpoints.

  4. If you need to specify an AWS Glue metastore or add additional data source configurations, update the Data Access Configuration field. See Data access configuration.

    Serverless SQL endpoints support the default Databricks metastore and AWS Glue as a metastore, but do not support external Hive metastores.

(Optional) Set up an AWS instance profile to use with your Serverless SQL endpoints

If you use an AWS instance profile to connect to AWS data sources other than your workspace’s root S3 bucket, you may need to update your AWS instance profile and its associated IAM role. For more information about using instance profiles, see Secure access to S3 buckets using instance profiles. See also Configure an instance profile.

  1. As a Databricks workspace administrator, go to the SQL admin console in Databricks SQL.

    If you are in the Data Science & Engineering or Databricks Machine Learning workspace environment, you may need to select SQL from the sidebar. Click the icon below the Databricks logo.

    Once you are in Databricks SQL, click User Settings Icon Settings at the bottom of the sidebar and select SQL Admin Console.

    SQL Admin Console
  2. In the SQL admin console, click the SQL Endpoint Settings tab.

  3. Confirm that you are using an instance profile.

    If you are using an instance profile, it will be selected in the Instance Profile field. Make a note of it for the next step.

    If the field value is None, you are not using instance profiles, and you can skip this entire procedure.

  4. In your AWS account, go to the AWS console, select the IAM service, and click the Roles tab.

  5. Select the role for the instance profile that is configured in your SQL endpoint settings.

  6. In the summary area, review the Role ARN and Instance Profile ARNs fields. Confirm that the name after the final slash is the same in both fields. For example:

    enable preview

    If these do not match, contact your Databricks representative.

  7. In the role editor for the role that you want to modify, go to the details page, click the Trust relationships tab, and click Edit trust relationship.

  8. Under Policy Document, add the following statement to your instance profile’s trust policy.

    Replace the sts:ExternalId array with one or more workspace IDs (as shown in the example) for all workspaces that will use Serverless SQL endpoints.

    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::790110701330:role/serverless-customer-resource-role"
        ]
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": [
            "databricks-serverless-<YOUR_WORKSPACE_ID1>",
            "databricks-serverless-<YOUR_WORKSPACE_ID2>"
          ]
        }
      }
    }
    

    For example:

    Trust relationship policy
  9. Click Update Trust Policy.
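The sts:ExternalId entries follow a fixed pattern: the literal prefix databricks-serverless- followed by a workspace ID. A small sketch that builds them (the sample ID is made up for illustration):

```python
def external_ids(workspace_ids):
    """Build the sts:ExternalId values used in the trust policy
    above from a list of workspace IDs."""
    return [f"databricks-serverless-{wid}" for wid in workspace_ids]

# "1234567890123456" is a made-up workspace ID for illustration.
print(external_ids(["1234567890123456"]))
```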

Important

Any time your instance profile changes, you may need to repeat this procedure.