Run and visualize a query in Databricks SQL, PowerBI, and Tableau

This quickstart shows you how to query a table of people to find the number of women named Mary grouped by birth year, and visualize the result.

The columns of the table are id, firstName, middleName, lastName, gender, birthDate, ssn, and salary.

This article shows how to use the Databricks SQL UI and the BI tools Power BI and Tableau Online to do the same query.

Before you begin

Your Databricks SQL admin must complete the Databricks SQL Set up a user to query a table before you complete the steps in the following sections.

Databricks does not automatically create a serverless starter warehouse. Create one now and call it Serverless Starter Warehouse. See Configure SQL warehouses.

Log in to Databricks SQL

You can access all of your Databricks assets using the sidebar. The sidebar’s contents depend on the selected persona: Data Science & Engineering, Machine Learning, or SQL.

  • By default, the sidebar appears in a collapsed state and only the icons are visible. Move your cursor over the sidebar to expand to the full view.

  • To change the persona, click the icon below the Databricks logo Databricks logo, and select a persona.

    change persona
  • To pin a persona so that it appears the next time you log in, click pin persona next to the persona. Click it again to remove the pin.

  • Use Menu options at the bottom of the sidebar to set the sidebar mode to Auto (default behavior), Expand, or Collapse.

If the Databricks Data Science & Engineering or Databricks Machine Learning environment displays, use the sidebar to switch to Databricks SQL. If you do not have access to Databricks SQL, request access from an admin.

When you log in to Databricks SQL your landing page looks like this:

User landing page

Query the people table in Databricks SQL

In Set up a user to query a table, an admin created a table named people10. In this section, you run a query on this table for number of people named Mary for each birth year.

  1. Click New Icon New in the sidebar and select Query.

    The SQL editor displays.

  2. Select a warehouse.

    If your workspace satisfies serverless compute requirements and your account does not require accepting terms of use, choose a serverless SQL warehouse, which is called Serverless Starter Warehouse if you created one based on the instructions above. For more information about account terms of use and instance profile configuration, see Enable serverless SQL warehouses.

    Otherwise, click Starter Warehouse.

    The first time you create a query the list of available SQL warehouses displays in alphabetical order. The next time you create a query, the last used warehouse is selected.

  3. Paste in this SELECT statement that queries the number of women named Mary that are born each year:

     SELECT year(birthDate) as birthYear, count(*) AS total
     FROM default.people10m
     WHERE firstName = 'Mary' AND gender = 'F'
     GROUP BY birthYear
     ORDER BY birthYear
    
  4. Press Ctrl/Cmd + Enter or click Run (1000).

    Limit 1000 is selected by default for all queries to ensure that the query returns at most 1000 rows. If you want to return all rows for this query, you can unselect LIMIT 1000 by clicking the Run (1000) drop-down. If you want to specify a different limit on the number of rows, you can add a LIMIT clause in your query with a value of your choice.

    The query result displays in the Table tab.

    Query result
  5. Click Save to save this query.

  6. In the Save query text box, enter Marys and click Save.

Create a visualization in Databricks SQL

  1. In the results table, click + and then click Visualization.

    The visualization editor displays.

    Visualization editor
  2. Change the visualization name from Scatter 1 to Marys by Birth Year.

  3. In the Visualization Type drop-down, select Bar.

  4. In the X Column drop-down, verify birthYear is selected.

  5. In the Y Column drop-down, verify total is selected.

  6. Click the X Axis tab.

  7. In the Name field, enter Birth Year.

  8. Click the Y Axis tab.

  9. In the Name field, enter Number of Marys by Birth Year.

  10. Click Save.

    The saved chart displays in the SQL editor.

    Marys SQLA chart

Create a dashboard in Databricks SQL

  1. Click New Icon New in the sidebar and select Dashboard.

  2. Enter People as the dashboard name.

  3. Click Save.

  4. In the warehouse drop-down list, select Serverless Starter Warehouse (if serverless is enabled for your workspace, or Starter Warehouse is serverless is not enabled).

  5. In the Add drop-down list, click Visualization.

  6. In the query list, select Marys.

  7. In the Select existing visualization list, select Marys by Birth Year.

  8. In the Title text box, enter Marys by Birth Year.

    Add visualization widget
  9. Click Add to Dashboard.

  10. Click Done Editing.

    Complete dashboard

Get SQL warehouse connection details

  1. Click Endpoints Icon SQL Warehouses in the sidebar.

  2. In the Warehouses list, type Serverless Starter in the filter box (or Starter if serverless is not enabled).

  3. Click the Serverless Starter Warehouse link (or the warehouse of your choice).

  4. Click the Connection Details tab.

    Connection details
  5. Click Copy Icon to copy the Server Hostname and HTTP Path.

Get a personal access token

A personal access token authenticates you to Databricks SQL.

  1. Click your username on the menu bar in the upper right of the workspace and select User Settings from the drop down.

  2. Click the Personal Access Tokens tab.

  3. Click + Generate New Token.

  4. Optionally enter a comment and modify the token lifetime.

  5. Click Generate.

  6. Click Copy Icon to copy the token and click OK.

Query Databricks SQL using Power BI

Configure the connection from Power BI to Databricks SQL

  1. Select Get Data > More… > Azure and select the Azure Databricks connector.

    Databricks connector in Get Data list

    Note

    Although the label says Azure Databricks, the connector also works with Databricks on AWS.

  2. Click Connect.

  3. Enter the Server Hostname and HTTP Path you copied in Get SQL warehouse connection details.

    Power BI data source
  4. Click OK.

  5. At the authentication prompt, select the Personal Access Token tab. Enter the personal access token you copied in Get a personal access token.

  6. Click Connect.

    If the SQL warehouse is not running, it will start but may take a few minutes to become available.

  7. In the Power BI Navigator, select the default.people10m table.

  8. Click Load.

Shape the data

Return the records of women with the first name Mary and compute the number of Marys in each year.

  1. Remove all columns except firstName, gender, and birthDate.

  2. Filter the data with firstName = Mary and gender = F.

  3. Change the type of birthDate to Date and remove all date components except year.

  4. Group by year to count the number of Marys per year.

Marys Power BI table

Visualize the data

Marys Power BI chart

Query Databricks SQL using Tableau Online

Configure the connection from Tableau Online to Databricks SQL

  1. In Tableau Online, open a workbook or create a new one.

  2. Select Data > New Data Source.

    Tableau Online data source
  3. In the Connect to Data dialog, select Connectors > Databricks.

    Tableau Online Databricks
  4. In the Databricks connection dialog, enter the Server Hostname and HTTP Path you copied in Get SQL warehouse connection details.

    Tableau data source
  5. Authenticate using token as the username and the token from Get a personal access token as the password.

  6. Click Sign in.

    If the SQL warehouse is not running, it will start but may take a few minutes to become available.

  7. In the Schema field, enter default. Press Enter.

  8. In the Select schema drop-down, select default.

  9. In the Table field, enter people10m. Press Enter.

  10. Drag the people10 table to the canvas.

Shape the data

Return the records of women with the first name Mary and compute the number of Marys in each year.

  1. Remove all columns except firstName, gender, and birthDate.

  2. Filter the data with firstName = Mary and gender = F.

  3. Change the type of birthDate to Date and remove all date components except year.

  4. Group by year to count the number of Marys per year.

Marys Tableau table

Visualize the data

Visualize as a bar chart:

Marys Tableau chart