Alteryx

This article describes how to use Alteryx with Databricks.

Step 1: Download and install software

Download and install Alteryx Designer and the Simba Spark ODBC driver. Both are configured in the steps below.

Step 2: Get Databricks connection information

  1. Get a personal access token.
  2. Get your cluster’s server hostname, port, and HTTP path, using the instructions in Server hostname, port, HTTP path, and JDBC URL.
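The values gathered in this step are reused several times later, so it can help to keep them in one place and check them before typing them into dialogs. The following is a minimal Python sketch of that idea, not part of Alteryx or Databricks. The class name `DatabricksConnectionInfo`, its field names, and the sample values are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabricksConnectionInfo:
    """Hypothetical container for the values collected in Step 2."""

    server_hostname: str  # bare hostname, no "https://" prefix
    port: int
    http_path: str
    access_token: str  # personal access token; treat it as a secret

    def __post_init__(self):
        # Basic sanity checks only; they do not contact Databricks.
        if not self.server_hostname or "://" in self.server_hostname:
            raise ValueError("server_hostname must be a bare hostname without a scheme")
        if not 1 <= self.port <= 65535:
            raise ValueError("port must be between 1 and 65535")
        if not self.http_path:
            raise ValueError("http_path must not be empty")
        if not self.access_token:
            raise ValueError("access_token must not be empty")


# Placeholder values for illustration; substitute your own.
example_info = DatabricksConnectionInfo(
    server_hostname="example.cloud.databricks.com",
    port=443,
    http_path="sql/example/path",
    access_token="example-token",
)
```

Keeping the hostname free of a scheme matters because Step 3 expects the bare host, while Step 4 expects the same host with https:// in front.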

Step 3: Configure the Simba Spark ODBC driver

  1. Open the ODBC Admin console that matches the driver type you installed (32-bit or 64-bit).
  2. In the User DSN tab, click Add.
  3. Select Simba Spark ODBC Driver.
  4. Enter the following:
    • Data Source Name: Databricks
    • Description: (optional)
    • Spark Server Type: SparkThriftServer
    • Host(s): the server hostname from Step 2
    • Port: the port from Step 2
    • Authentication: HTTP
    • Mechanism: Token
    • User Name: token
    • Password: the personal access token from Step 2
  5. Select Save Password (Encrypted).
  6. Select Advanced Options.
  7. Select Fast SQLPrepare.
  8. Select Get Tables With Query.
  9. Select Show System Table.
  10. Click Test to verify the connection.
  11. Click OK.
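Once the data source is saved, it can also be checked from outside the ODBC Admin console. The sketch below is one way to do that in Python, under stated assumptions: the DSN is named Databricks as configured above, and the third-party `pyodbc` package is installed (it is not mentioned in this article). The helper names `build_dsn_connection_string` and `check_dsn` are hypothetical. `DSN`, `UID`, and `PWD` are standard ODBC connection string keywords.

```python
def build_dsn_connection_string(dsn: str, user: str, password: str) -> str:
    """Build an ODBC connection string that refers to an existing DSN."""

    def quote(value: str) -> str:
        # ODBC convention: wrap values containing special characters in
        # braces, and double any closing brace inside the value.
        if any(ch in value for ch in ";{}"):
            return "{" + value.replace("}", "}}") + "}"
        return value

    return f"DSN={dsn};UID={user};PWD={quote(password)}"


def check_dsn(connection_string: str) -> bool:
    """Open a connection and run a trivial query (requires pyodbc)."""
    import pyodbc  # third-party; imported lazily so the builder works without it

    connection = pyodbc.connect(connection_string, autocommit=True)
    try:
        cursor = connection.cursor()
        cursor.execute("SELECT 1")
        return cursor.fetchone() is not None
    finally:
        connection.close()
```

A call such as `check_dsn(build_dsn_connection_string("Databricks", "token", my_token))`, where `my_token` holds your personal access token, should return `True` if the DSN is set up correctly and the cluster is reachable.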

Step 4: Configure a connection to a Databricks cluster in Alteryx

  1. In Alteryx Designer, go to the In-Database tool tab.
  2. Drag a Connect In-DB tool onto the canvas.
  3. In the Configuration Panel, click the drop-down menu under Connection Name.
  4. Select Manage Connections…
  5. Set Connection Type to User.
  6. Under Connections, click New and enter the following:
    • Connection Name: Databricks (or other preferred name)
    • Password Encryption: Encrypt for User (or other if preferred)
  7. Click the Read tab and enter the following:
    1. Driver: Spark ODBC
    2. Click the drop-down menu under Connection String.
    3. Select New Database Connection…
      1. Click the Spark Data Source Name drop-down and select Databricks (User).
      2. Click OK.
  8. Click the Write tab and enter the following:
    1. Driver: Databricks Bulk Loader (Avro) or Databricks Bulk Loader (CSV)
    2. Click the drop-down menu under Connection String.
    3. Select New Databricks Connection…
    4. Under ODBC Data Source, select Databricks (User), and then complete the following fields:
      • In the Username field, enter token.
      • In the Password field, enter your personal access token from Step 2.
      • In the Databricks URL field, enter https:// followed by the server hostname from Step 2.
  9. Click OK.
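The Databricks URL on the Write tab is the Step 2 server hostname with https:// in front. A small Python sketch of that rule follows. The function name `databricks_url` and the sample hostname are hypothetical, and the tolerance for a pasted scheme or trailing slash is a convenience of this sketch, not documented Alteryx behavior.

```python
def databricks_url(server_hostname: str) -> str:
    """Return "https://" followed by the bare server hostname."""
    host = server_hostname.strip()
    # Tolerate a pasted value that already carries a scheme; the result
    # always uses https, as the step above specifies.
    for prefix in ("https://", "http://"):
        if host.lower().startswith(prefix):
            host = host[len(prefix):]
            break
    host = host.rstrip("/")
    if not host or "/" in host:
        raise ValueError("expected a bare hostname, for example example.cloud.databricks.com")
    return "https://" + host


# Placeholder hostname for illustration only.
print(databricks_url("example.cloud.databricks.com"))
```

Rejecting values that still contain a path keeps the HTTP path from Step 2 from being pasted into this field by mistake.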