Connect to Databricks from Microsoft Excel
This article describes how to use the Databricks ODBC driver to connect Databricks to Microsoft Excel. After you establish the connection, you can access the data in Databricks from Excel. You can also use Excel to further analyze the data.
Before you begin
Create a Databricks cluster and associate data with your cluster. See Run your first ETL workload on Databricks.
Install and configure the ODBC driver (Windows | MacOS | Linux). This sets up a Data Source Name (DSN) configuration that you can use to connect Databricks to Microsoft Excel.
Databricks personal access token.
Install Microsoft Excel. You can use a trial version.
Connect from Microsoft Excel
This section describes how to pull data from Databricks into Microsoft Excel using the DSN you created in the prerequisites.
Steps to connect using OAuth 2.0
Note
The steps in this section were tested using Excel for Microsoft 365 for Windows Server 2022 Datacenter 64 bit.
The following are the steps to allow a user to connect to Databricks in a single sign-on experience.
Launch ODBC Data Sources.
Go to System DSN tab and select Simba Spark entry in the DSN list (or you can also create a new DSN by following the instructions hosted on Microsoft site).
Click on Configure button and you will see the below pop-up window.
On Mechanism, select OAuth 2.0.
Click on OAuth Options button, you will see the following OAuth Option pop-up window.
Select Browser Based Authorization Code and Uncheck IGNORE_SQLDRIVER_NOPROMPT. Close the pop-up window.
Click on HTTP Options button and enter the HTTP path in the pop-up window.
Close the HTTP Options pop-up window. Click on Advanced Options button, then click on Server Side Properties button in the pop-up window.
Add a server side property Auth_Flow and value 2.
Now you have successfully configured an ODBC DSN.
Launch Microsoft Excel and create a new blank workbook. Select menu Data > Get Data > From Other Sources > From ODBC
Select the DSN you just configured.
Click the OK button to connect. You will be prompted to authenticate yourself on a browser pop-up window.
Connect using a connection URL with OAuth 2.0
Note
The steps in this section were tested using Excel for Microsoft 365 for Windows version 11.
You can also directly connect Excel to Databricks using a connection URL. The connection URL is in the following format:
Driver=Simba Spark ODBC Driver;Host=<hostName>;Port=443;HttpPath=<httpPath>;SSL=1;AuthMech=11;Auth_Flow=2;Catalog=samples;Schema=default
Launch Excel and select menu Data > Get Data > From Other Sources > From ODBC
In Data source name (DSN), select Simba Spark
Click and expand Advanced options section.
Enter the above connection URL in the text box of Connection string (non-credential properties)(optional)
Click the OK button.
Click the Connect button. You should be prompted by a dialog that you can use to authenticate yourself through OAuth 2.0.
Steps to connect using Databricks personal access token
Note
The steps in this section were tested using Excel for Microsoft 365 for Mac version 16.70.
Open a blank workbook in Microsoft Excel.
In the Data ribbon, click the down caret next to Get Data (Power Query), then click From database (Microsoft Query).
In the iODBC Data Source Chooser, select the DSN that you created in the prerequisites, and then click OK.
For Username, enter
token
.For Password, enter your personal access token from the prerequisites.
In the Microsoft Query dialog, select the Databricks table that you want to load into Excel, and then click Return Data.
In the Import Data dialog, select Table and Existing sheet, and then click Import.
After you load your data into your Excel workbook, you can perform analytical operations on it.