Databricks SQL Driver for Node.js

The Databricks SQL Driver for Node.js is a Node.js library that allows you to use JavaScript code to run SQL commands on Databricks compute resources.

Requirements

Specify the connection variables

To access your cluster or SQL warehouse, the Databricks SQL Driver for Node.js uses connection variables named token, server_hostname and http_path, representing your Databricks personal access token and your cluster’s or SQL warehouse’s Server Hostname and HTTP Path values, respectively.

The Databricks personal access token value for token is similar to the following: dapi1ab2c34defabc567890123d4efa56789.

The Server Hostname value for server_hostname is similar to the following: dbc-a1b2345c-d6e7.cloud.databricks.com.

The HTTP Path value for http_path is similar to the following: for a cluster, sql/protocolv1/o/1234567890123456/1234-567890-abcdefgh; and for a SQL warehouse, /sql/1.0/endpoints/a1b234c5678901d2.

Note

As a security best practice, you should not hard code these connection variable values into your code. Instead, you should retrieve these connection variable values from a secure location. For example, the code example later in this article uses environment variables.

Query data

The following code example demonstrates how to call the Databricks SQL Driver for Node.js to run a basic SQL query on a Databricks compute resource. This command returns the first two rows from the diamonds table.

The diamonds table is included in Sample datasets. This table is also featured in Tutorial: Query data with notebooks.

This code example retrieves the token, server_hostname and http_path connection variable values from a set of Databricks environment variables. These environment variables have the following environment variable names:

  • DATABRICKS_TOKEN, which represents your Databricks personal access token from the requirements.

  • DATABRICKS_SERVER_HOSTNAME, which represents the Server Hostname value from the requirements.

  • DATABRICKS_HTTP_PATH, which represents the HTTP Path value from the requirements.

You can use other approaches to retrieving these connection variable values. Using environment variables is just one approach among many.

The following code example demonstrates how to call the Databricks SQL Connector for Node.js to run a basic SQL command on a cluster or SQL warehouse. This command returns the first two rows from the diamonds table.

const { DBSQLClient } = require('@databricks/sql');

var token          = process.env.DATABRICKS_TOKEN;
var serverHostname = process.env.DATABRICKS_SERVER_HOSTNAME;
var httpPath       = process.env.DATABRICKS_HTTP_PATH;

if (!token || !serverHostname || !httpPath) {
  throw new Error("Cannot find Server Hostname, HTTP Path, or personal access token. " +
                  "Check the environment variables DATABRICKS_TOKEN, " +
                  "DATABRICKS_SERVER_HOSTNAME, and DATABRICKS_HTTP_PATH.");
}

const client = new DBSQLClient();

client.connect(
  options = {
    token: token,
    host:  serverHostname,
    path:  httpPath
  }).then(
    async client => {
      const session = await client.openSession();

      const queryOperation = await session.executeStatement(
        statement = 'SELECT * FROM default.diamonds LIMIT 2',
        options   = {
          runAsync: true,
          maxRows: 10000 // This option enables the direct results feature.
        }
      );

      const result = await queryOperation.fetchAll({
        progress: false,
        callback: () => {},
      });

      await queryOperation.close();

      console.table(result);

      await session.close();
      await client.close();
}).catch((error) => {
  console.log(error);
});
import { DBSQLClient } from '@databricks/sql';
import IDBSQLSession from '@databricks/sql/dist/contracts/IDBSQLSession';
import IOperation from '@databricks/sql/dist/contracts/IOperation';

var serverHostname: string = process.env.DATABRICKS_SERVER_HOSTNAME || '';
var httpPath: string       = process.env.DATABRICKS_HTTP_PATH || '';
var token: string          = process.env.DATABRICKS_TOKEN || '';

if (serverHostname == '' || httpPath == '' || token == '') {
  throw new Error("Cannot find Server Hostname, HTTP Path, or personal access token. " +
                  "Check the environment variables DATABRICKS_SERVER_HOSTNAME, " +
                  "DATABRICKS_HTTP_PATH, and DATABRICKS_TOKEN.");
}

const client: DBSQLClient = new DBSQLClient();

client.connect(
  {
    host:  serverHostname,
    path:  httpPath,
    token: token
  }).then(
    async client => {
      const session: IDBSQLSession = await client.openSession();

      const queryOperation: IOperation = await session.executeStatement(
        'SELECT * FROM default.diamonds LIMIT 2',
        {
          runAsync: true,
          maxRows: 10000 // This option enables the direct results feature.
        }
      );

      const result = await queryOperation.fetchAll({
        progress: false,
        callback: () => {},
      });

      await queryOperation.close();

      console.table(result);

      await session.close();
      client.close();
}).catch((error) => {
  console.log(error);
});

Output:

┌─────────┬─────┬────────┬───────────┬───────┬─────────┬────────┬───────┬───────┬────────┬────────┬────────┐
│ (index) │ _c0 │ carat  │    cut    │ color │ clarity │ depth  │ table │ price │   x    │   y    │   z    │
├─────────┼─────┼────────┼───────────┼───────┼─────────┼────────┼───────┼───────┼────────┼────────┼────────┤
│    0    │ '1' │ '0.23' │  'Ideal'  │  'E'  │  'SI2'  │ '61.5' │ '55'  │ '326' │ '3.95' │ '3.98' │ '2.43' │
│    1    │ '2' │ '0.21' │ 'Premium' │  'E'  │  'SI1'  │ '59.8' │ '61'  │ '326' │ '3.89' │ '3.84' │ '2.31' │
└─────────┴─────┴────────┴───────────┴───────┴─────────┴────────┴───────┴───────┴────────┴────────┴────────┘

Configure Logging

The logger provides information for debugging problems with the connector. All DBSQLClient objects are instantiated with a logger that prints to the console, but by passing in a custom logger, you can send this information to a file. The following example shows how to configure a logger and change its level.

const { DBSQLClient, DBSQLLogger, LogLevel } = require('@databricks/sql');

var token          = process.env.DATABRICKS_TOKEN;
var serverHostname = process.env.DATABRICKS_SERVER_HOSTNAME;
var httpPath       = process.env.DATABRICKS_HTTP_PATH;

if (!token || !serverHostname || !httpPath) {
  throw new Error("Cannot find Server Hostname, HTTP Path, or personal access token. " +
                  "Check the environment variables DATABRICKS_TOKEN, " +
                  "DATABRICKS_SERVER_HOSTNAME, and DATABRICKS_HTTP_PATH.");
}

const logger = new DBSQLLogger({ filepath: 'log.txt', level: LogLevel.info });
const client = new DBSQLClient({ logger: logger });

client.connect(
  options = {
    token: token,
    host:  serverHostname,
    path:  httpPath
  }).then(
    async client => {
      const session = await client.openSession();

      let queryOperation = await session.executeStatement('SELECT "Hello, World!"', { runAsync: true });
      const result = await queryOperation.fetchAll();
      await queryOperation.close();

      console.table(result);

      // Set logger to different level.
      //
      logger.setLevel(LogLevel.debug);

      queryOperation = await session.executeStatement('SELECT "Hello, World!"', { runAsync: true });
      result = await queryOperation.fetchAll();

      await queryOperation.close();
      await session.close();
      await client.close();
}).catch((error) => {
  console.log(error);
});
import { DBSQLClient } from '@databricks/sql';
import IDBSQLSession from '@databricks/sql/dist/contracts/IDBSQLSession';
import IOperation from '@databricks/sql/dist/contracts/IOperation';

var serverHostname: string = process.env.DATABRICKS_SERVER_HOSTNAME || '';
var httpPath: string       = process.env.DATABRICKS_HTTP_PATH || '';
var token: string          = process.env.DATABRICKS_TOKEN || '';

if (serverHostname == '' || httpPath == '' || token == '') {
  throw new Error("Cannot find Server Hostname, HTTP Path, or personal access token. " +
                  "Check the environment variables DATABRICKS_SERVER_HOSTNAME, " +
                  "DATABRICKS_HTTP_PATH, and DATABRICKS_TOKEN.");
}

const client: DBSQLClient = new DBSQLClient();

client.connect(
  {
    host:  serverHostname,
    path:  httpPath,
    token: token
  }).then(
    async client => {
      const session: IDBSQLSession = await client.openSession();

      let queryOperation: IOperation = await session.executeStatement('SELECT "Hello, World!"', { runAsync: true });
      const result = await queryOperation.fetchAll();
      await queryOperation.close();

      console.table(result);

      // Set logger to different level.
      //
      logger.setLevel(LogLevel.debug);

      queryOperation = await session.executeStatement('SELECT "Hello, World!"', { runAsync: true });
      result = await queryOperation.fetchAll();

      await queryOperation.close();
      await session.close();
      await client.close();
}).catch((error) => {
  console.log(error);
});

For additional examples, see the examples folder in the databricks/databricks-sql-nodejs repository on GitHub.

Additional resources

API reference

Classes

DBSQLClient class

Main entry point for interacting with a database.

Methods
connect method

Opens a connection to the database.

Parameters

options

Type: ConnectionOptions

The set of options used to connect to the database.

The host, path, and token fields must be populated.

Example:

const client: DBSQLClient = new DBSQLClient();

client.connect(
  {
    host:  serverHostname,
    path:  httpPath,
    token: token
  }
)

Returns: Promise<IDBSQLClient>

openSession method

Opens session between DBSQLClient and database.

Parameters

request

Type: OpenSessionRequest

A set of optional parameters for specifying initial schema and initial catalog

Example:

const session = await client.openSession(
  {initialCatalog: 'catalog'}
);

Returns: Promise<IDBSQLSession>

getClient method

Returns internal thrift TCLIService.Client object. Must be called after DBSQLClient has connected.

No parameters

Returns TCLIService.Client

close method

Closes the connection to the database and releases all associated resources on the server. Any additional calls to this client will throw error.

No parameters.

No return value.

DBSQLSession class

DBSQLSessions are primarily used for the execution of statements against the databbase as well as various metadata fetching operations.

Methods
executeStatement method

Executes a statement with the options provided.

Parameters

statement

Type: str

The statement to be executed.

options

Type: ExecuteStatementOptions

A set of optional parameters for determining query timeout, max rows for direct results, and whether to run the query asynchronously. By default maxRows is set to 10000. If maxRows is set to null, the operation will run with the direct results feature off.

Example:

const session = await client.openSession(
  {initialCatalog: 'catalog'}
);

queryOperation = await session.executeStatement(
  'SELECT "Hello, World!"', { runAsync: true }
);

Returns: Promise<IOperation>

close method

Closes the session. Must be done after using session.

No parameters.

No return value.

getId method

Returns the GUID of the session.

No parameters.

Returns: str

getTypeInfo method

Returns information about supported data types.

Parameters

request

Type: TypeInfoRequest

Request parameters.

Returns: Promise<IOperation>

getCatalogs method

Gets list of catalogs.

Parameters

request

Type: CatalogsRequest

Request parameters.

Returns: Promise<IOperation>

getSchemas method

Gets list of schemas.

Parameters

request

Type: SchemasRequest

Request parameters. Fields catalogName and schemaName can be used for filtering purposes.

Returns: Promise<IOperation>

getTables method

Gets list of tables.

Parameters

request

Type: TablesRequest

Request parameters. Fields catalogName and schemaName and tableName can be used for filtering.

Returns: Promise<IOperation>

getFunctions method

Gets list of tables.

Parameters

request

Type: FunctionsRequest

Request parameters. Field functionName is required.

Returns: Promise<IOperation>

getPrimaryKeys method

Gets list of primary keys.

Parameters

request

Type: PrimaryKeysRequest

Request parameters. Fields schemaName and tableName are required.

Returns: Promise<IOperation>

getCrossReference method

Gets information about foreign keys between two tables.

Parameters

request

Type: CrossReferenceRequest

Request parameters. Schema, Parent, and Catalog name must be specified for both tables.

Returns: Promise<IOperation>

DBSQLOperation class

DBSQLOperations are created by DBSQLSessions and can be used to fetch the results of statements and check up on their execution. Data is fetched through functions fetchChunk and fetchAll.

Methods
getId method

Returns the GUID of the operation.

No parameters.

Returns: str

fetchAll method

Waits for operation completion, then fetches all rows from operation.

Parameters: None

Returns: Promise<Array<object>>

fetchChunk method

Waits for operation completion, then fetches up to a specified number of rows from an operation.

Parameters

options

Type: FetchOptions

Options used to fetch. Currently, the only option is maxRows, which corresponds to the max number of data objects to be returned in any given array.

Returns: Promise<Array<object>>

close method

Closes the operation and releases all associated resources. Must be done after no longer using operation.

No parameters.

No return value.