Alteryx

This page describes how to use Alteryx with Databricks.

Requirements

Alteryx 10.6 and above. In-Database processing requires 64-bit Alteryx with 64-bit database drivers.

Get the JDBC connection information for your Databricks cluster

Get your cluster’s server hostname, port, and HTTP path using the instructions in Connecting BI Tools.
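If the cluster page gives you a full JDBC URL rather than separate fields, the three values can be pulled out of it programmatically. The following is a minimal sketch, not an official tool: it assumes the URL has the shape `jdbc:spark://<hostname>:<port>/<database>;key=value;...` with an `httpPath` property, and the hostname and HTTP path in the example are placeholders.

```python
from urllib.parse import urlparse


def parse_jdbc_url(jdbc_url: str) -> dict:
    """Split a Spark JDBC URL into hostname, port, and HTTP path.

    Assumes the form jdbc:spark://host:port/db;key=value;... (an assumption,
    check it against the URL shown for your own cluster).
    """
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix):
        raise ValueError("expected a URL starting with 'jdbc:'")
    # Separate the 'scheme://host:port/db' part from the ';key=value' properties.
    location, *props = jdbc_url[len(prefix):].split(";")
    parsed = urlparse(location)
    settings = dict(p.split("=", 1) for p in props if "=" in p)
    return {
        "hostname": parsed.hostname,
        "port": parsed.port,
        "http_path": settings.get("httpPath"),
    }


# Placeholder URL for illustration only.
example = (
    "jdbc:spark://example.cloud.databricks.com:443/default;"
    "transportMode=http;ssl=1;"
    "httpPath=sql/protocolv1/o/0/0000-000000-example0;"
    "AuthMech=3"
)
info = parse_jdbc_url(example)
print(info)
```

The returned values map onto the Host(s) and Port fields of the ODBC data source; the HTTP path is typically entered in the driver's HTTP options.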

Download and configure the Simba Spark ODBC Driver

  1. Download and install the latest Databricks Simba Spark ODBC Driver.

  2. Open the ODBC Admin console that matches the driver architecture (32-bit or 64-bit).

    In-Database processing requires 64-bit Alteryx with 64-bit database drivers.

  3. In the User tab, click Add.

  4. Select Simba Spark ODBC Driver.

  5. Enter the following:

  • Data Source Name: Databricks

  • Description: (Optional)

  • Spark Server Type: SparkThriftServer (Spark 1.1 and later)

  • Host(s): the hostname of your Databricks cluster

  • Port: the port of your Databricks cluster

  • Authentication: HTTP

  • Mechanism: Token

    Alternatively, you can authenticate with a user name and password.

  • User Name: “token” (literally, the word “token”)

    Alternatively, enter your Databricks username.

  • Password: your Databricks personal access token.

    Alternatively, enter your Databricks password.

  6. Select Save Password (Encrypted).
  7. Select Advanced Options.
  8. Select Fast SQLPrepare.
  9. Select Get Tables With Query.
  10. Select Show System Table.
  11. Click Test to test the connection setup.
  12. Click OK to finish setup and exit the ODBC Admin console.
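For reference, the same settings can also be written as a DSN-less ODBC connection string, which is useful for checking the values outside the ODBC Admin console. This is a sketch under stated assumptions: the key names (`HTTPPath`, `ThriftTransport`, `SparkServerType`, `AuthMech`) and their numeric codes follow the Simba Spark ODBC driver's connection-string conventions as commonly documented, so verify them against the driver's install guide. `build_connection_string` is a hypothetical helper, and the host, HTTP path, and token are placeholders.

```python
def build_connection_string(host: str, port: int, http_path: str, token: str) -> str:
    """Assemble a DSN-less connection string mirroring the dialog fields above.

    Key names and numeric codes are assumptions based on the Simba Spark
    ODBC driver documentation; confirm them for your driver version.
    """
    settings = {
        "Driver": "Simba Spark ODBC Driver",
        "Host": host,
        "Port": str(port),
        "HTTPPath": http_path,
        "SSL": "1",
        "ThriftTransport": "2",   # assumed code for HTTP transport
        "SparkServerType": "3",   # assumed code for SparkThriftServer
        "AuthMech": "3",          # assumed code for user name and password
        "UID": "token",           # literally the word "token"
        "PWD": token,             # personal access token
    }
    # Dicts preserve insertion order, so the keys appear in the order listed.
    return ";".join(f"{key}={value}" for key, value in settings.items())


conn_str = build_connection_string(
    host="example.cloud.databricks.com",                   # placeholder
    port=443,
    http_path="sql/protocolv1/o/0/0000-000000-example0",   # placeholder
    token="dapiEXAMPLE",                                   # placeholder
)
print(conn_str)
```

If an ODBC client library such as pyodbc is installed, a string like this could be passed to `pyodbc.connect(conn_str, autocommit=True)` to confirm the driver connects before configuring Alteryx.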

Set up Databricks In-Database connection in Alteryx

In-Database processing requires 64-bit Alteryx with 64-bit database drivers.

  1. In Alteryx Designer, go to the In-Database tool tab.

  2. Drag a Connect In-DB tool onto the canvas.

  3. In the Configuration Panel, click the dropdown under Connection Name.

  4. Select Manage Connections...

  5. Set Connection Type to User.

  6. Under Connections, click New and enter the following:

    • Connection Name: Databricks (or another name you prefer)
    • Password Encryption: Encrypt for User (or another option, if preferred)
  7. On the Read tab:

    1. Driver: Spark ODBC
    2. Click the dropdown menu under Connection String.
    3. Select New Database Connection....
    4. Click the Spark Data Source Name dropdown and select Databricks (User).
    5. Click OK.
  8. On the Write tab:

    1. Driver: Databricks Bulk Loader (Avro) or Databricks Bulk Loader (CSV)
    2. Click the dropdown menu under Connection String.
    3. Select New Databricks Connection....
    4. Under ODBC Data Source, select Databricks (User).
    5. Enter your Databricks User Name and Password.
    6. For Databricks URL, enter https://<your-databricks-hostname>
  9. Click OK.
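The Databricks URL field expects only the scheme and hostname, but a URL copied from the browser often carries a path or query string. As a small illustrative sketch, the hypothetical helper below reduces a pasted value to the `https://<your-databricks-hostname>` form; the hostname shown is a placeholder.

```python
from urllib.parse import urlparse


def databricks_url(value: str) -> str:
    """Reduce a pasted workspace address to 'https://<hostname>'."""
    value = value.strip()
    # urlparse only recognizes the host when a scheme is present.
    if "://" not in value:
        value = "https://" + value
    host = urlparse(value).hostname
    if not host:
        raise ValueError("could not find a hostname in the given value")
    # Always use https and drop any port, path, query, or fragment.
    return f"https://{host}"


url = databricks_url("https://example.cloud.databricks.com/?o=123#job/1")
print(url)
```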