Connect to Matillion

Matillion ETL is an ETL/ELT tool built specifically for cloud database platforms, including Databricks. Matillion ETL has a modern, browser-based UI with powerful push-down ETL/ELT functionality.

You can integrate your Databricks SQL warehouses (formerly Databricks SQL endpoints) and Databricks clusters with Matillion.

Connect to Matillion using Partner Connect

This section describes how to use Partner Connect to simplify the process of connecting an existing SQL warehouse or cluster in your Databricks workspace to Matillion.

Requirements

See the requirements for using Partner Connect.

Steps to connect

To connect to Matillion using Partner Connect, follow the steps in this section.

Tip

If you have an existing Matillion account, Databricks recommends that you connect to Matillion manually. This is because the connection experience in Partner Connect is optimized for new partner accounts.

  1. In the sidebar, click Partner Connect.

  2. Click the Matillion tile.

    The Email box displays the email address for your Databricks account. Matillion uses this email address to prompt you to either create a new Matillion account or sign in to your existing Matillion account.

  3. Click Connect to Matillion ETL or Sign in.

    A new tab opens in your browser that displays the Matillion Hub.

  4. Complete the on-screen instructions in Matillion to create your 14-day trial Matillion account or to sign in to your existing Matillion account.

    Important

    If an error displays stating that someone from your organization has already created an account with Matillion, contact one of your organization’s administrators and have them add you to your organization’s Matillion account. After they add you, sign in to your existing Matillion account.

  5. Complete the on-screen instructions to provide your job details, then click Continue.

  6. Complete the on-screen instructions to create an organization, then click Continue.

  7. Click the organization you created, then click Add Matillion ETL instance.

  8. Click Continue in AWS.

    The Amazon EC2 console opens.

  9. Follow Launching Matillion ETL using Amazon Machine Image in the Matillion ETL documentation, starting with step 5. Then follow Accessing Matillion ETL on Amazon Web Services (EC2) in the Matillion ETL documentation.

  10. Follow the instructions in the Matillion ETL documentation.

    Matillion ETL opens in your browser, and the Create Project dialog box displays.

  11. Follow Create a Delta Lake on Databricks project in the Matillion documentation.

    For the settings in the Delta Lake Connection section within these instructions, enter the following information:

    • For Workspace ID, enter the ID of your Databricks workspace. See Workspace instance names, URLs, and IDs.

    • For Username, enter the word token.

    • For Password, enter the value of a Databricks personal access token.

    To get the Workspace ID and generate a personal access token, do the following:

    1. Return to the Partner Connect tab in your browser.

    2. Take note of the Workspace ID.

    3. Click Generate a new token.

      A new tab opens in your browser that displays the Settings page of the Databricks UI.

    4. Click Generate new token.

    5. Optionally enter a description (comment) and expiration period.

    6. Click Generate.

    7. Copy the generated personal access token and store it in a secure location.

    8. Return to the Matillion tab in your browser.

    For the settings in the Delta Lake Defaults section within these instructions, for Cluster, choose the name of the SQL warehouse or cluster.

  12. Continue with Next steps.
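The Delta Lake Connection values from step 11 can be summarized in a short sketch. This is only an illustration of which value goes in which field, not part of Matillion or Databricks tooling: the function names and the environment variable names `DATABRICKS_WORKSPACE_ID` and `DATABRICKS_TOKEN` are hypothetical, chosen here so that the token is not hard-coded.

```python
import os


def delta_lake_connection_settings(workspace_id: str, token: str) -> dict:
    """Map the three Delta Lake Connection fields to the values from step 11."""
    if not workspace_id:
        raise ValueError("workspace ID is required")
    if not token:
        raise ValueError("personal access token is required")
    return {
        "Workspace ID": workspace_id,
        "Username": "token",  # the literal word "token", not an account name
        "Password": token,    # the personal access token value itself
    }


def settings_from_env() -> dict:
    """Read the values from environment variables (hypothetical names)."""
    return delta_lake_connection_settings(
        os.environ.get("DATABRICKS_WORKSPACE_ID", ""),
        os.environ.get("DATABRICKS_TOKEN", ""),
    )
```

Keeping the token in an environment variable or a secret store, rather than in a file, matches the advice in the steps above to store it in a secure location.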

Connect to Matillion manually

This section describes how to connect an existing SQL warehouse or cluster in your Databricks workspace to Matillion manually.

Note

You can connect to Matillion using Partner Connect to simplify the experience.

Requirements

Before you integrate with Matillion manually, you must have the following:

Steps to connect

To connect to Matillion manually, do the following:

  1. Get the name of the existing compute resource that you want to use (a SQL warehouse or cluster) within your workspace. Later, you will choose that name to complete the connection between your compute resource and your Matillion ETL instance.

    • To view SQL warehouses in your workspace, click SQL Warehouses in the sidebar. To create a new SQL warehouse, see Create a SQL warehouse.

    • To view the clusters in your workspace, click Compute in the sidebar. To create a cluster, see Compute configuration reference.

  2. Follow Connect to your Matillion ETL instance and log in to it in the Matillion documentation.

  3. Follow Create a Delta Lake on Databricks project in the Matillion documentation.

    For the settings in the Delta Lake Connection section within these instructions, enter the following information:

    • For Workspace ID, enter the ID of your Databricks workspace. See Workspace instance names, URLs, and IDs.

    • For Username, enter the word token.

    • For Password, enter the Databricks personal access token.

    For the settings in the Delta Lake Defaults section within these instructions, for Cluster, choose the name of the SQL warehouse or cluster.

  4. Continue with Next steps.
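As an alternative to looking up the compute resource name in the sidebar (step 1), the names can also be listed through the Databricks REST API. The following is a minimal sketch, assuming a personal access token and the `/api/2.0/clusters/list` and `/api/2.0/sql/warehouses` endpoints; the function names and the `host` value are placeholders introduced here, and pagination and error handling are left out.

```python
import json
import urllib.request


def fetch_json(host: str, path: str, token: str) -> dict:
    """GET a Databricks REST API path using personal access token auth."""
    request = urllib.request.Request(
        f"https://{host}{path}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def compute_names(clusters_payload: dict, warehouses_payload: dict) -> list:
    """Collect cluster and SQL warehouse names from the two API responses."""
    names = [c["cluster_name"] for c in clusters_payload.get("clusters", [])]
    names += [w["name"] for w in warehouses_payload.get("warehouses", [])]
    return sorted(names)


def list_compute_names(host: str, token: str) -> list:
    """Return the names that could be chosen for the Cluster setting."""
    return compute_names(
        fetch_json(host, "/api/2.0/clusters/list", token),
        fetch_json(host, "/api/2.0/sql/warehouses", token),
    )
```

Calling `list_compute_names("<workspace-instance-name>", token)` with your own workspace host and token returns a sorted list of names. One of them is the value to choose for Cluster in the Delta Lake Defaults section.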

Next steps

Explore one or more of the following resources on the Matillion website: