JDBC/ODBC drivers and configuration parameters

You can connect business intelligence (BI) tools to Databricks clusters to query data in tables. Every Databricks cluster runs a JDBC/ODBC server on the driver node. This article describes how to get JDBC/ODBC drivers and configuration parameters to connect to Databricks clusters. For tool-specific connection instructions, see Business intelligence tools.

Cluster requirements

To access a cluster using JDBC/ODBC you must have Can Attach To permission.

If you connect to a terminated cluster using JDBC/ODBC and have Can Restart permission, the cluster will be restarted.

Step 1: Download and install a JDBC/ODBC driver

For all BI tools, you need a JDBC or ODBC driver to make a connection to Databricks clusters.

  1. Go to the Databricks JDBC / ODBC Driver Download page.
  2. Fill out the form and submit it. The page will update with links to multiple download options.
  3. Select a driver and download it.
  4. Install the driver. For JDBC, a JAR is provided which does not require installation. For ODBC, an installation package is provided for your chosen platform that must be installed on your system.

Step 2: Gather JDBC/ODBC connection parameters

To configure a JDBC/ODBC driver, you must collect connection parameters from Databricks. Here are some of the parameters a JDBC/ODBC driver might require:

Parameters Value
Username and password See Username and password.
Host, port, HTTP path, JDBC URL See Server hostname, port, HTTP path, and JDBC URL.

The following are usually specified in the httpPath for JDBC and the DSN conf for ODBC:

Parameters Value
Spark Server Type Spark Thrift Server
Schema/Database default
Authentication Mechanism (AuthMech) Username and password authentication
Thrift Transport http
SSL 1

The following is for performance. Ask your vendor to change the parameter if you can’t access it:

Parameters Value
(Batch) Fetch Size 100000

Username and password

If SSO is disabled, you can also use your Databricks username and password to authenticate to the cluster gateway. If SSO is enabled, you must use a personal access token.

Server hostname, port, HTTP path, and JDBC URL

  1. On the cluster configuration page, click the Advanced Options toggle.

  2. Click the JDBC/ODBC tab. It contains the hostname, port, protocol, HTTP path, and JDBC URL.

    JDBC-ODBC tab
  3. Copy the parameters required by your BI tool.

  4. If your tool requires the JDBC URL, replace <personal-access-token> with the token you created in Username and password. For example:

    jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/xxxx-xxxxxx-xxxxxxxx;AuthMech=3;UID=token;PWD=dapideadbeefdeadbeefdeadbeefdeadbeef
    
  5. ODBC and JDBC drivers accept SQL queries in ANSI SQL-92 dialect and translate the queries to Spark SQL. If your application generates Spark SQL directly or your application uses any non-ANSI SQL-92 standard SQL syntax specific to Databricks Runtime, Databricks recommends that you add ;UseNativeQuery=1 to the connection configuration. With that setting, drivers pass the SQL queries verbatim to Databricks Runtime.

ODBC Data Source Name configuration for the Simba ODBC driver

The Data Source Name (DSN) configuration contains the parameters for communicating with a specific database. BI tools like Tableau usually provide a user interface for entering these parameters. If you have to install and manage the Simba ODBC driver yourself, you might need to create the configuration files and also allow your Driver Manager (ODBC Data Source Administrator on Windows and unixODBC/iODBC on Unix) to access them. Create two files: /etc/odbc.ini and /etc/odbcinst.ini.

/etc/odbc.ini

The content of /etc/odbc.ini can be:

[Databricks-Spark]
Driver=Simba
Server=<server-hostname>
HOST=<server-hostname>
PORT=<port>
SparkServerType=3
Schema=default
ThriftTransport=2
SSL=1
AuthMech=3
UID=token
PWD=<personal-access-token>
HTTPPath=<http-path>

Set <personal-access-token> to the token you retrieved in Username and password and the other parameters to the ones you retrieved in Server hostname, port, HTTP path, and JDBC URL.

/etc/odbcinst.ini

The content of /etc/odbcinst.ini can be:

[ODBC Drivers]
Simba = Installed
[Simba]
Driver = <driver-path>

Set <driver-path> according to the operating system you chose when you downloaded the driver in Step 1:

  • MacOs /Library/simba/spark/lib/libsparkodbc_sbu.dylib
  • Linux /opt/simba/sparkodbc/lib/universal/libsimbasparkodbc.dylib

Configure paths

Specify the paths of the two files in environment variables so that they can be used by the Driver Manager:

export ODBCINI=/etc/odbc.ini
export ODBCSYSINI=/etc/odbcinst.ini
export SIMBASPARKINI=<simba-ini-path>/simba.sparkodbc.ini # (Contains the configuration for debugging the Simba driver)

where <simba-ini-path> is

  • MacOS /Library/simba/spark/lib
  • Linux /opt/simba/sparkodbc/lib/universal