Perform the required admin onboarding tasks for Databricks SQL

Preview

This feature is in Public Preview.

For a brief overview of Databricks SQL’s capabilities, with no configuration required, see Quickstart: Learn about Databricks SQL by importing dashboards from the Sample Dashboard Gallery. Before you can start using Databricks SQL with your organization’s own data, an administrator must complete some onboarding tasks. This article guides you through those tasks.

Requirements

Use the sidebar

You can access all of your Databricks assets using the sidebar. The sidebar’s contents depend on the selected persona: Data Science & Engineering, Machine Learning, or SQL.

  • By default, the sidebar appears in a collapsed state and only the icons are visible. Move your cursor over the sidebar to expand to the full view.

  • To change the persona, click the icon below the Databricks logo Databricks logo, and select a persona.

    change persona
  • To pin a persona so that it appears the next time you log in, click pin persona next to the persona. Click it again to remove the pin.

  • Use Menu options at the bottom of the sidebar to set the sidebar mode to Auto (default behavior), Expand, or Collapse.

Step 1: Configure data access

If you store tables in cloud storage, you must provide Databricks SQL with an instance profile so it can access the data. You can then use standard SQL statements in the next step to configure fine-grained access on individual databases, tables, and views.

Complete the instructions to configure data access, then continue to Step 2: Grant permissions.

Note

If you use Databricks managed tables, you do not need to configure access to cloud storage.

When data access is configured correctly, administrators can browse the external tables using the data explorer (Data in the sidebar).

Step 2: Grant permissions

The Databricks SQL security model is based on the well-established security model in SQL databases that allows you to set fine-grained access using standard SQL statements such as GRANT and REVOKE.

In Databricks SQL, you use table access control to set up such fine-grained permissions based on the SQL standard. With table access control, administrators and object owners can define an expressive, cloud-agnostic, and fine-grained security model that provides end-to-end security on your data lake with auditability.

For this step, Databricks strongly recommends that you set permissions using groups synchronized with your identity provider. By default, only administrators can see all data (based on the configuration done in Step 1: Configure data access). You can grant permissions in the data explorer (Data in the sidebar) or by using SQL commands in the query editor. The following instructions use the data explorer.

  1. In the sidebar, click Data.

  2. The data explorer opens and shows the tables in the default database (called default) in the default catalog (called hive_metastore).

    To select a different database, click default and select a database.

    To select a database in a different catalog, click hive_metastore and select a different catalog, then click select database to select a database.

  3. To manage permissions for the entire database, click Permissions. To manage permissions for a table or view, click its name and then click Permisions.

  4. To grant permissions, click Grant.

    1. Enter the user or group.
    2. Select one or more permissions to grant, or select All to grant all permissions.
    3. Click OK.
  5. To revoke a permission, select it from the list, then click Revoke.

Now that you’ve granted users and groups permission to view and interact with data, continue to Step 3: Configure SQL endpoints.

Step 3: Configure SQL endpoints

A SQL endpoint is a computation resource that lets you run SQL commands on data objects within the Databricks environment. A small SQL endpoint called Starter Endpoint has been created to help you get started. To handle more complex workloads, you can easily increase its size (to reduce latency) or the number of underlying clusters (to handle more concurrent users). To reduce costs, the starter endpoint is configured to terminate after 120 minutes idle.

If Serverless compute (Gated Private Preview) is enabled for your workspace and you enable Serverless SQL endpoints, a Serverless SQL endpoint called Serverless Starter Endpoint is also automatically created, and you can use it for dashboards, visualizations, and queries. Serverless SQL endpoints start more quickly than classic SQL endpoints and automatically terminate after 10 minutes idle.

To configure the starter endpoint or create additional SQL endpoints, click SQL Endpoints in the sidebar. To learn more, see SQL endpoints.