DBeaver integration with Databricks
Note
This article covers DBeaver, which is neither provided nor supported by Databricks. To contact the provider, see use the Issues page of the dbeaver/dbeaver repo on GitHub.
DBeaver is a local, multi-platform database tool for developers, database administrators, data analysts, data engineers, and others who need to work with databases. DBeaver supports Databricks as well as other popular databases.
This article describes how to use your local development machine to install, configure, and use the free, open source DBeaver Community Edition (CE) to work with databases in Databricks.
Note
This article was tested with macOS, Databricks JDBC Driver version 2.6.25, and DBeaver CE version 22.1.0.
Requirements
Before you install DBeaver, your local development machine must meet the following requirements:
A Linux 64-bit, macOS, or Windows 64-bit operating system. (Linux 32-bit is supported but not recommended.)
The Databricks JDBC Driver onto your local development machine, extracting the
DatabricksJDBC42.jar
file from the downloadedDatabricksJDBC42-<version>.zip
file.
You must also have a Databricks cluster or SQL warehouse to connect with DBeaver.
Step 1: Install DBeaver
Download and install DBeaver CE as follows:
Linux: Download and run one of the Linux installers from the Download page on the DBeaver website.
snap
andflatpak
installation options are provided on this page as well.macOS: Use Homebrew to run
brew install --cask dbeaver-community
, or use MacPorts to runsudo port install dbeaver-community
. A macOS installer is also available from the Download page on the DBeaver website.Windows: Use Chocolatey to run
choco install dbeaver
. A Windows installer is also available from the Download page on the DBeaver website.
Step 2: Configure the Databricks JDBC Driver for DBeaver
Set up DBeaver with information about the Databricks JDBC Driver that you downloaded earlier.
Start DBeaver.
If you are prompted to create a new database, click No.
If you are prompted to connect to or select a database, click Cancel.
Click Database > Driver Manager.
In the Driver Manager dialog box, click New.
In the Create new driver dialog box, click the Libraries tab.
Click Add File.
Browse to the
DatabricksJDBC42.jar
file that you extracted earlier and click Open.Click Find Class.
In the Driver class list, confirm that com.databricks.client.jdbc.Driver is selected.
On the Settings tab, for Driver Name, enter
Databricks
.On the Settings tab, for Class Name, enter
com.databricks.client.jdbc.Driver
.Click OK.
In the Driver Manager dialog box, click Close.
Step 3: Connect DBeaver to your Databricks databases
Use DBeaver to connect to the cluster or SQL warehouse to access the databases in your Databricks workspace.
In DBeaver, click Database > New Database Connection.
In the Connect to a database dialog box, on the All tab, click Databricks, and then click Next.
Click the Main tab and enter a value in the JDBC URL field for your Databricks resource:
Find the JDBC URL value on the JDBC/ODBC tab in the Advanced Options area for your cluster. The JDBC URL should look similar to the following example:
jdbc:databricks://dbc-a1b2345c-d6e7.cloud.databricks.com:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/1234567890123456/1234-567890-reef123;AuthMech=3;UID=token;PWD=<personal-access-token>
Important
If the JDBC URL starts with
jdbc:spark:
, you must change it tojdbc:databricks:
or else you will get a connection error later.Replace
<personal-access-token>
with your personal access token for the Databricks workspace.Check Save password locally.
Tip
If you do not want to store your personal access token on your local development machine, omit
UID=token;PWD=<personal-access-token>
from the JDBC URL and uncheck Save password locally. You will be prompted for your Username (the wordtoken
) and Password (your personal access token) each time you try to connect.Find the JDBC URL value on the Connection Details tab for your SQL warehouse. The JDBC URL should look similar to the following example:
jdbc:databricks://dbc-a1b2345c-d6e7.cloud.databricks.com:443/default;transportMode=http;ssl=1;AuthMech=3;httpPath=/sql/1.0/warehouses/a123456bcde7f890;
Important
If the JDBC URL starts with
jdbc:spark:
, you must change it tojdbc:databricks:
or else you will get a connection error later.For Username, enter the word
token
.For Password, enter your personal access token.
Check Save password locally.
Tip
If you do not want to store your personal access token on your local development machine, leave Username and Password blank and uncheck Save password locally. You will be prompted for your Username (the word
token
) and Password (your personal access token) each time you try to connect.Click Test Connection.
Tip
You should start your Databricks resource before testing your connection. Otherwise the test might take several minutes to complete while the resource starts.
If the connection succeeds, in the Connection Test dialog box, click OK.
In the Connect to a database dialog box, click Finish.
In the Database Navigator window, a Databricks entry is displayed. To change the connection’s name to make it easier to identify:
Right-click Databricks, and then click Edit Connection.
In the Connection configuration dialog box, click General.
For Connection name, replace
Databricks
with a different name for the connection.Click OK.
Repeat the instructions in this step for each resource that you want DBeaver to access.
Step 4: Use DBeaver to browse data objects
Use DBeaver to access data objects in your Databricks workspace such as tables and table properties, views, indexes, data types, and other data object types.
In DBeaver, in the Database Navigator window, right-click the connection that you want to use.
If Connect is enabled, click it. (If Connect is disabled, you are already connected.)
Tip
You should start your resource before trying to connect to it. Otherwise the connection might take several minutes to complete while the resource starts.
Expand the connection that you just connected to.
Expand and browse available data objects. Double-click a data object to get more information about it.
Repeat the instructions in this step to access additional data objects.
Step 5: Use DBeaver to run SQL statements
Use DBeaver to load the sample diamonds
table from the Sample datasets into the default
database in your workspace and then query the table. For more information, see Create a table. If you do not want to load a sample table, skip ahead to Next steps.
In DBeaver, in the Database Navigator window, right-click the connection that you want to use.
If Connect is enabled, click it. (If Connect is disabled, you are already connected.)
Tip
You should start your resource before trying to connect to it. Otherwise the connection might take several minutes to complete while the resource starts.
Click SQL Editor > New SQL Script.
On the (connection-name) Script-1 tab, enter these SQL statements, which deletes a table named
diamonds
if it exists, and then creates a table nameddiamonds
based on the contents of the CSV file in the Databricks File System (DBFS) mount point:DROP TABLE IF EXISTS diamonds; CREATE TABLE diamonds USING CSV OPTIONS (path "/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv", header "true");
Click SQL Editor > Execute SQL Statement.
In the Database Navigator window, expand the default database and click Refresh.
Expand Tables, and then double-click diamonds.
Within the diamonds tab, click the Data tab to see the table’s data.
To delete the diamonds
table:
Click SQL Editor > New SQL Script.
On the (connection-name) Script-2 tab, enter this SQL statement, which deletes the
diamonds
table.DROP TABLE IF EXISTS diamonds;
On the SQL Editor menu, click Execute SQL Statement.
In the Database Navigator window, right-click the default database and then click Refresh. The
diamonds
table disappears from the list of tables.
Next steps
Use the Database object editor to work with database object properties, data, and entity relation diagrams.
Use the Data editor to view and edit data in a database table or view.
Use the SQL editor to work with SQL scripts.
Work with entity relation diagrams (ERDs) in DBeaver.
Import and export data into and from DBeaver.
Migrate data using DBeaver.
Troubleshoot JDBC driver issues with DBeaver.