Import and query data using the Databricks Excel Add-in
This feature is in Public Preview.
The Databricks Excel Add-in connects your Databricks workspace to Microsoft Excel, bringing governed Lakehouse data directly into your spreadsheets to help you move from data to decisions faster.
This page describes how to use the Databricks Excel Add-in to import and analyze Databricks data in Excel. You can browse and import Databricks tables through an intuitive interface; no SQL knowledge is required. The add-in also lets you run custom SQL queries, but doing so is optional.
Prerequisites
Before you use the Excel Add-in, verify that it is installed and set up.
Select a SQL warehouse
Choose which SQL warehouse to use:
- In the upper-right of the Databricks Add-in pane in Excel, click the drop-down menu.
- Select which SQL warehouse you want to use.
Import data from Databricks
Import data from Databricks in Excel by selecting a table, writing a SQL query, or importing a pivot table.
You can import Unity Catalog metric views using pivot tables, SQL queries, and custom functions.
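For example, when you query a metric view through the Write SQL import method, measures are referenced with the MEASURE() aggregate function. The following is a sketch only; the metric view and column names are hypothetical:

```sql
-- Hypothetical metric view: main.analytics.sales_metrics
-- MEASURE() evaluates a measure defined in the metric view
SELECT
  region,
  MEASURE(total_revenue) AS total_revenue
FROM main.analytics.sales_metrics
GROUP BY region;
```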
Create pivot tables
To create a pivot table from Unity Catalog tables and views in Excel:
- In the Databricks Excel Add-in pane, under the New import tab, select Select data as the Import method.
- Under Catalog, select the table you want to create a pivot table from and click Select.
- Select the Pivot Data checkbox.
- Configure your Row, Column, Value, and Filters as needed.
- (Optional) To see a sample of the import, click Preview.
- (Optional) Set a row limit for your import.
- Click Import results.
note
Pivot tables can only be imported to a new sheet.
When working with Unity Catalog metrics in pivot tables, you might see Sum(measure) displayed in the results. This is expected behavior: Excel requires that values have an aggregation function, but because the data contains unique values, no additional aggregation actually occurs.
Select tables
Data is imported as an Excel table object. You can move the table or rename the sheet, and the Excel Add-in refreshes data in the new location.
To import data from a Databricks table, do the following:
- In the Databricks Excel Add-in pane, under the New import tab, select Select data as the Import method.
- Choose a table to import from the Catalog explorer. You can filter the catalog by owner, certification status, and other properties using the filter icon.
- Click Select.
- Under Columns, click the caret and deselect the columns you don't want to import, or leave all columns selected to import the entire table.
- (Optional) To set filters, click + next to Filters, select the column that you want to apply a filter to, and then enter your filter condition.
- (Optional) To see a sample of the import, click Preview.
- (Optional) Set a row limit to restrict the number of rows imported.
- (Optional) To easily identify your imported data, enter an Import name.
- Under Output Destination, choose to import the data to a new sheet or the current sheet. If you import to the current sheet, data starts at the selected cell.
- Click Import results.
Write SQL queries
The Write SQL import method supports SQL functions and stored procedures.
To run custom SQL queries against your Databricks workspace, do the following:
- In the Databricks Excel Add-in pane, under the New import tab, select Write SQL as the Import method.
- Enter a name for your query to identify it later.
- Write a new query or use an existing query:
  - To write a new query, enter your SQL in the editor. You can query any table in Unity Catalog that you have permissions to access. Click the Catalog explorer icon to view your schemas and tables.
  - To use a query from your Databricks workspace or an existing query in Excel, click the folder icon. If you use an existing query from your Databricks workspace, edits made in Excel are not reflected in Databricks.
- (Optional) To add query parameters, click +Add next to Parameters. Click the parameter to specify the Parameter Name and Parameter Value.
- Under Output Destination, choose to import the data to a new sheet or the current sheet. If you import to the current sheet, data starts at the selected cell.
- To preview your query results, click Run.
- Click Import results.
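For example, a query like the following uses a named parameter, :city, whose value you supply under Parameters. This is a sketch using the samples catalog referenced elsewhere on this page; adjust the table name to match your data:

```sql
-- :city is a named query parameter; set its value under Parameters in the add-in
SELECT *
FROM samples.bakehouse.sales_suppliers
WHERE city = :city
LIMIT 100;
```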
You can also use custom functions to add query parameters. See Write SQL.
Use custom functions in Excel
The Excel Add-in provides custom functions that you can use in Excel formulas to import data from Databricks.
Select a table
The DATABRICKS.Table function imports data from a Unity Catalog table.
Syntax:
=DATABRICKS.Table(catalog_name.schema_name.table_name, [column1, ...], [limit])
Parameters:
- catalog_name.schema_name.table_name (required): The fully qualified table name.
- columns (optional): An array of column names to import. Omit this parameter to import all columns.
- limit (optional): The maximum number of rows to import. Omit this parameter to import all rows, up to the 10 MB limit.
Example:
=DATABRICKS.Table("main.default.customers", {"customer_id", "customer_name"}, 100)
This formula imports the customer_id and customer_name columns from the main.default.customers table, limited to 100 rows.
Write SQL
The DATABRICKS.SQL function runs a SQL query that uses query parameters and returns the results.
Syntax:
Specify parameters using values.
=DATABRICKS.SQL("query_text", {parameter1_name, parameter1_value; ...})
Specify parameters using a cell range. The name and value parameters must be defined in cells that are in the same row.
=DATABRICKS.SQL("query_text", {param_name_cell: param_value_cell; ...})
Parameters:
- query_text (required): The SQL query to run.
- parameters (required): A mapping of parameter names to values to substitute into the query.
Example:
=DATABRICKS.SQL("SELECT * FROM samples.bakehouse.sales_suppliers WHERE longitude > :long_param AND latitude > :lat_param LIMIT 10", {"long_param",20; "lat_param",10})
=DATABRICKS.SQL("SELECT * FROM samples.bakehouse.sales_suppliers WHERE city = :city", M4:N4)
The first formula runs a query that filters sales data by longitude and latitude, using the provided parameter values. The second formula reads the parameter name and value from cells M4 and N4.
Manage queries
Manage your existing imports from the Imports page.
Edit an existing import
To edit an existing import:
- In the Databricks Add-in pane in Excel, click the Imports tab.
- Find the import you want to edit.
- Click the three-dot menu next to the import.
- Click Edit to edit your import.
Refresh data
The Excel Add-in does not automatically refresh imported data. To update your data with the latest values from Databricks:
- To refresh a single import, do the following:
- In the Databricks Add-in pane in Excel, click the Imports tab.
- Click the refresh icon next to the import you want to refresh.
- To refresh all imports, do the following:
- Click Refresh All in the Databricks Add-in pane.
The Add-in runs the original query or table selection again and updates your worksheet with fresh data.
When refreshing data, the Excel Add-in clears all existing data in the specified table and reloads the newest data from Databricks. Any custom columns you added to the table are deleted during the refresh process.
Sharing implications
When you share an Excel workbook that contains Databricks data, consider the following data access and security implications:
Visibility to imported data
When a recipient refreshes an import, the Add-in uses the recipient's Unity Catalog permissions. If they do not have access to the underlying data, the refresh fails.
For workbooks where data privacy is a concern, you can use the following workaround:
- Create a workbook with all necessary formulas and imports.
- Delete the imported data from the sheet.
- Share the workbook with the recipient.
- Have the recipient refresh the data.
The recipient only sees data they have access to based on their Unity Catalog permissions.
Access to workspaces and data assets
- Users without access to the Unity Catalog objects referenced in the workbook can't refresh the data. To refresh data, users must have read permissions on the underlying tables and views in Unity Catalog.
- Users must have access to the underlying table in Databricks to edit existing imports.
Query visibility
Users with edit access to the workbook can view the queries used to generate the data through the Databricks Add-in, even if they don't have access to the underlying data in Unity Catalog.
Limitations
- Custom functions: Query results are limited to 25 MiB due to limitations of the SQL execution API.