Databricks SQL Driver for Node.js
Experimental
The Databricks SQL Driver for Node.js is provided as-is and is not officially supported by Databricks through customer technical support channels. Support, questions, and feature requests can be communicated through the Issues page of the databricks/databricks-sql-nodejs repo on GitHub. Issues with the use of this code will not be answered or investigated by Databricks Support.
The Databricks SQL Driver for Node.js is a Node.js library that allows you to use JavaScript code to run SQL commands on Databricks compute resources.
Requirements
A development machine running Node.js, version 14 or higher. To print the installed version of Node.js, run the command
node -v
. To install and use different versions of Node.js, you can use tools such as Node Version Manager (nvm).Node Package Manager (
npm
). Later versions of Node.js already includenpm
. To check whethernpm
is installed, run the commandnpm -v
. To installnpm
if needed, you can follow instructions such as the ones at Download and install npm.The @databricks/sql package from npm. To install the
@databricks/sql
package in your Node.js project, usenpm
to run the following command from within the same directory as your project:npm i @databricks/sql
An existing cluster or SQL warehouse.
The Server Hostname and HTTP Path value for the existing cluster or SQL warehouse.
A Databricks personal access token.
Specify the connection variables
To access your cluster or SQL warehouse, the Databricks SQL Driver for Node.js uses connection variables named token
, server_hostname
and http_path
, representing your Databricks personal access token and your cluster’s or SQL warehouse’s Server Hostname and HTTP Path values, respectively.
The Databricks personal access token value for token
is similar to the following: dapi1ab2c34defabc567890123d4efa56789
.
The Server Hostname value for server_hostname
is similar to the following: dbc-a1b2345c-d6e7.cloud.databricks.com
.
The HTTP Path value for http_path
is similar to the following: for a cluster, sql/protocolv1/o/1234567890123456/1234-567890-abcdefgh
; and for a SQL warehouse, /sql/1.0/endpoints/a1b234c5678901d2
.
Note
As a security best practice, you should not hard code these connection variable values into your code. Instead, you should retrieve these connection variable values from a secure location. For example, the code example later in this article uses environment variables.
Query data
The following code example demonstrates how to call the Databricks SQL Driver for Node.js to run a basic SQL query on a Databricks compute resource. This command returns the first two rows from the diamonds
table.
The diamonds
table is included in Sample datasets (databricks-datasets). This table is also featured in Get started with Databricks as a data scientist.
This code example retrieves the token
, server_hostname
and http_path
connection variable values from a set of environment variables. These environment variables have the following environment variable names:
DATABRICKS_TOKEN
, which represents your Databricks personal access token from the requirements.DATABRICKS_SERVER_HOSTNAME
, which represents the Server Hostname value from the requirements.DATABRICKS_HTTP_PATH
, which represents the HTTP Path value from the requirements.
You can use other approaches to retrieving these connection variable values. Using environment variables is just one approach among many.
const { DBSQLClient } = require('@databricks/sql');
var token = process.env.DATABRICKS_TOKEN;
var server_hostname = process.env.DATABRICKS_SERVER_HOSTNAME;
var http_path = process.env.DATABRICKS_HTTP_PATH;
if (!token || !server_hostname || !http_path) {
throw new Error("Cannot find Server Hostname, HTTP Path, or personal access token. " +
"Check the environment variables DATABRICKS_TOKEN, " +
"DATABRICKS_SERVER_HOSTNAME, and DATABRICKS_HTTP_PATH.")
}
const client = new DBSQLClient();
const utils = DBSQLClient.utils;
client.connect(
options = {
token: token,
host: server_hostname,
path: http_path
}).then(
async client => {
const session = await client.openSession();
const queryOperation = await session.executeStatement(
statement = 'SELECT * FROM default.diamonds LIMIT 2',
options = { runAsync: true });
await utils.waitUntilReady(
operation = queryOperation,
progress = false,
callback = () => {});
await utils.fetchAll(
operation = queryOperation
);
await queryOperation.close();
const result = utils.getResult(
operation = queryOperation
).getValue();
console.table(result);
await session.close();
client.close();
}).catch(error => {
console.log(error);
});
Output:
┌─────────┬─────┬────────┬───────────┬───────┬─────────┬────────┬───────┬───────┬────────┬────────┬────────┐
│ (index) │ _c0 │ carat │ cut │ color │ clarity │ depth │ table │ price │ x │ y │ z │
├─────────┼─────┼────────┼───────────┼───────┼─────────┼────────┼───────┼───────┼────────┼────────┼────────┤
│ 0 │ '1' │ '0.23' │ 'Ideal' │ 'E' │ 'SI2' │ '61.5' │ '55' │ '326' │ '3.95' │ '3.98' │ '2.43' │
│ 1 │ '2' │ '0.21' │ 'Premium' │ 'E' │ 'SI1' │ '59.8' │ '61' │ '326' │ '3.89' │ '3.84' │ '2.31' │
└─────────┴─────┴────────┴───────────┴───────┴─────────┴────────┴───────┴───────┴────────┴────────┴────────┘
For additional examples, see the examples folder in the databricks/databricks-sql-nodejs repository on GitHub.
Additional resources
The Databricks SQL Driver for Node.js repository on GitHub
Getting started with the Databricks SQL Driver for Node.js
Troubleshooting the Databricks SQL Driver for Node.js