Quickstart: Run and visualize a query

This quickstart shows you how to query a table of people to find the number of women named Mary grouped by birth year, and visualize the result.

The columns of the table are id, firstName, middleName, lastName, gender, birthDate, ssn, and salary.

This article shows how to use the Databricks SQL UI and the BI tools Power BI and Tableau Online to do the same query.

Requirements

Your Databricks SQL admin must complete the Databricks SQL admin quickstart.

Databricks SQL UI

Your landing page looks like this when you first log in to Databricks SQL:

User landing page

Use the sidebar

You can access all of your Databricks assets using the left sidebar. The sidebar’s contents depend on the selected persona: Data Science & Engineering, Machine Learning, or Databricks SQL.

  • By default, the sidebar appears in a collapsed state and only the icons are visible. Move your cursor over the sidebar to expand to the full view.

  • To change the persona, click the icon immediately below the Databricks logo, and select a persona from the menu.

    change persona
  • To pin a persona so that it appears next time you log in, click pin persona next to the persona. Click it again to remove the pin.

  • Use Menu options at the bottom of the sidebar to set the sidebar mode to Auto (default behavior), Expand, or Collapse.

Log in to Databricks SQL

If you have access to both the Databricks Data Science & Engineering workspace and the Databricks SQL environment, you might need to switch from Data Science & Engineering to Databricks SQL. See Use the sidebar for instructions.

Step 1: Query the people table

  1. Log in to Databricks SQL.

  2. Click Create Icon Create in the sidebar and select Query.

    The query editor displays.

  3. In the box below New Query, click the Down Arrow Icon icon and select QS Endpoint.

    Query editor

    The first time you create a query the list of available SQL endpoints displays in alphabetical order. The next time you create a query, the last used endpoint is selected.

  4. In the box below the endpoint, click the Down Arrow Icon icon and if not selected, select the default database.

    Default database
  5. Paste in a SELECT statement that queries the number of women named Mary:

     SELECT year(birthDate) as birthYear, count(*) AS total
     FROM default.people10m
     WHERE firstName = 'Mary' AND gender = 'F'
     GROUP BY birthYear
     ORDER BY birthYear
    
  6. Press Ctrl/Cmd + Enter or click the Execute button.

    The Limit 1000 checkbox is selected by default to ensure that the query returns at most 1000 rows. If you want more rows, you can unselect the checkbox and specify a LIMIT clause in your query. The query result displays in the Table tab.

    Query result

Step 2: Create a visualization

  1. Click the + Add Visualization tab.

    The visualization editor displays.

    Visualization editor
  2. In the X Column drop-down, select birthYear.

  3. In the Y Column drop-down, select total.

  4. Click the X Axis tab.

  5. In the Name field, enter Birth Year.

  6. Click the Y Axis tab.

  7. In the Name field, enter Number of Marys by Birth Year.

  8. Click Save.

    The saved chart displays in the query editor.

    Marys SQLA chart

Step 3: Create a dashboard and add visualization

  1. Click Create Icon Create in the sidebar and select Dashboard.

  2. Enter People.

  3. Click Save.

  4. Click the Add Widget button at the bottom right.

  5. Select Marys.

  6. Click Add to Dashboard.

  7. Click Done Editing.

    Complete dashboard

BI tools

This section shows how to collect Databricks SQL SQL endpoint connection information, connect Power BI Desktop and Tableau Online to the SQL endpoint, query a table of people, and visualize the data.

Gather SQL endpoint connection information

Step 1: Get SQL endpoint connection details

  1. Log in to Databricks SQL.

  2. Click Endpoints Icon SQL Endpoints in the sidebar.

  3. In the Endpoints list, type QS in the filter box.

  4. Click the QS Endpoint link.

  5. Click the Connection Details tab.

    Connection details
  6. Click Copy Icon to copy the Server Hostname and HTTP Path.

Step 2: Get personal access token

The personal access token authenticates you to Databricks SQL.

  1. Click User Settings Icon at the bottom of the sidebar and select Settings > User Settings.
  2. Click the Personal Access Tokens tab.
  3. Click + Generate New Token.
  4. Optionally enter a comment and modify the token lifetime.
  5. Click Generate.
  6. Click Copy Icon to copy the token and click OK.

Power BI

Step 1: Configure the connection from Power BI to the SQL endpoint

  1. Select Get Data > More… > Azure and select the Azure Databricks connector.

    Databricks connector in Get Data list

    Note

    Although the label says “Azure Databricks”, this connector works with Databricks on AWS.

  2. Click Connect.

  3. Enter the Server Hostname and HTTP Path you copied in Step 1: Get SQL endpoint connection details.

    Power BI data source
  4. Click OK.

  5. At the authentication prompt, select the Personal Access Token tab. Enter the personal access token you copied in Step 2: Get personal access token.

  6. Click Connect.

    If the SQL endpoint is not running, it will be started but make take a few minutes to return.

  7. In the Power BI Navigator, select the default.people10m table.

  8. Click Load.

Step 2: Shape the data

Return the records of women with the first name Mary and compute the number of Marys in each year.

  1. Remove all columns except firstName, gender, and birthDate.
  2. Filter the data with firstName = Mary and gender = F.
  3. Change the type of birthDate to Date and remove all date components except year.
  4. Group by year to count the number of Marys per year.
Marys Power BI table

Step 3: Visualize the data

Marys Power BI chart

Tableau Online

Step 1: Configure the connection from Tableau Online to the SQL endpoint

  1. In Tableau Online, open a workbook or create a new one.

  2. Select Data > New Data Source.

    Tableau Online data source
  3. In the Connect to Data dialog, select Connectors > Databricks.

    Tableau Online Databricks
  4. In the Databricks connection dialog, enter the Server Hostname and HTTP Path you copied in Step 1: Get SQL endpoint connection details.

    Tableau data source
  5. Authenticate using token as the username and the token from Step 2: Get personal access token as the password.

  6. Click Sign in.

    If the SQL endpoint is not running, it will be started but make take a few minutes to return.

  7. In the Schema field, enter default. Press Enter.

  8. In the Select schema drop-down, select default.

  9. In the Table field, enter people10m. Press Enter.

  10. Drag the people10 table to the canvas.

Step 2: Shape the data

Return the records of women with the first name Mary and compute the number of Marys in each year.

  1. Remove all columns except firstName, gender, and birthDate.
  2. Filter the data with firstName = Mary and gender = F.
  3. Change the type of birthDate to Date and remove all date components except year.
  4. Group by year to count the number of Marys per year.
Marys Tableau table

Step 3: Visualize the data

Visualize as a bar chart:

Marys Tableau chart