Fivetran automated data integration adapts as schemas and APIs change, ensuring reliable data access and simplified analysis with ready-to-query schemas.
You can integrate your Databricks SQL endpoints and Databricks clusters with Fivetran. The Fivetran integration with Databricks helps you easily centralize data from disparate data sources into Delta Lake.
Partner Connect does not integrate Fivetran with Databricks clusters. To integrate a cluster with Fivetran, skip ahead to Connect to Fivetran manually.
For an overview of the following procedure, watch this YouTube video (3 minutes).
Make sure your Databricks account, workspace, and the signed-in user all meet the requirements for Partner Connect.
In the sidebar, click Partner Connect.
Click the Fivetran tile.
If the Fivetran tile has a check mark icon inside it, one of your administrators has already used Partner Connect to connect Fivetran to your workspace. Contact that administrator, who can add you to the Fivetran account that they created by using Partner Connect. After they add you, click the Fivetran tile.
If the Connect to partner dialog shows a Next button, click it.
Partner Connect creates the following resources in your workspace:
- A SQL endpoint named FIVETRAN_ENDPOINT by default. (You can change this default name before you click Next.)
- A Databricks service principal named FIVETRAN_USER.
For Email, enter the email address that you want Fivetran to use to create a 14-day trial Fivetran account for you, or enter the email address for your existing Fivetran account.
Click Connect to Fivetran or Sign in.
If an error displays stating that someone from your organization has already created an account with Fivetran, do one of the following:
- Enter an email address that is not associated with your organization, and then click Connect to Fivetran or Sign in again.
- Contact one of your organization’s administrators and have them add you to your organization’s Fivetran account. After they add you, click Connect to Fivetran or Sign in again.
If you clicked Connect to Fivetran, Partner Connect creates a Databricks personal access token and associates it with the FIVETRAN_USER service principal.
A new tab opens in your web browser, which displays the Fivetran website.
Complete the on-screen instructions in Fivetran to create your 14-day trial Fivetran account or to sign in to your existing Fivetran account.
Continue with Next steps.
In this series of steps, you get the connection details for an existing compute resource (a SQL endpoint or cluster) in your workspace and then add those details to your Fivetran account.
To connect a SQL endpoint with Fivetran faster, use Partner Connect.
For a SQL endpoint, generate a Databricks personal access token and then:
- To get the connection details for an existing SQL endpoint, see Get connection details for a SQL endpoint. Specifically, you will need the SQL endpoint’s Server Hostname and HTTP Path field values.
- To view the available SQL endpoints in your workspace, see View SQL endpoints.
- To create a SQL endpoint in your workspace, see Create a SQL endpoint.
If the Fivetran tile in Partner Connect in your workspace has a check mark icon inside it, you can get the connection details for the connected SQL endpoint by clicking the tile and then expanding Connection details. Note, however, that the Personal access token value shown there is hidden. You must create a replacement personal access token and enter that new token when Fivetran asks you for it.
For a cluster, generate a Databricks personal access token and then:
- To get the connection details for an existing cluster, see Get connection details for a cluster. Specifically, you will need the cluster’s Server Hostname and HTTP Path field values.
- To view the available clusters in your workspace, see Display clusters.
- To create a cluster in your workspace, see Create a cluster.
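If you prefer to generate the personal access token programmatically rather than through the workspace UI, the following is a minimal sketch that builds (but does not send) a request to the Databricks Token API 2.0 (`POST /api/2.0/token/create`). It assumes you already hold a credential that is allowed to call the API. The workspace URL, the placeholder credential, the comment text, and the 90-day lifetime are hypothetical values chosen for illustration.

```python
import json
import urllib.request


def build_token_request(workspace_url, auth_token, comment, lifetime_seconds):
    """Build (but do not send) a token-creation request for the Token API 2.0."""
    body = json.dumps({
        "comment": comment,
        "lifetime_seconds": lifetime_seconds,
    }).encode("utf-8")
    return urllib.request.Request(
        url=workspace_url.rstrip("/") + "/api/2.0/token/create",
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer " + auth_token,
            "Content-Type": "application/json",
        },
    )


# Hypothetical values for illustration only; substitute your own.
request = build_token_request(
    workspace_url="https://example.cloud.databricks.com",
    auth_token="<existing-credential>",
    comment="Fivetran connection",
    lifetime_seconds=90 * 24 * 60 * 60,
)

# To actually create the token, send the request with
# urllib.request.urlopen(request) and read "token_value" from the JSON response.
```

The request is only constructed here so that the sketch has no side effects. The token value is returned once in the response, so store it somewhere safe before you enter it in Fivetran.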
For an overview of the following procedure, watch this YouTube video (2 minutes).
Sign in to your Fivetran account, or create a new Fivetran account, at https://fivetran.com/dashboard.
If you sign in to your organization’s Fivetran account, a Choose Destination page may display, listing one or more existing destination entries with the Databricks logo. These entries might contain connection details for compute resources in workspaces that are separate from yours. If you still want to reuse one of these connections, and you trust the compute resource and have access to it, choose that destination and then skip ahead to Next steps. Otherwise, choose any available destination to get past this page, and then go to https://fivetran.com/account.
In your Dashboard page in Fivetran, click the Destinations tab. (If the Dashboard page is not displayed, go to https://fivetran.com/account.)
Click Add Destination.
Enter a Destination name and click Add.
On the Fivetran is modern ELT page, click Set up a connector.
Click a data source, and then click Next.
Follow the on-screen instructions in the Setup Guide in Fivetran to finish setting up the connector.
Click Save & Test.
After the test succeeds, click Continue.
On the Select your data’s destination page, click Databricks on AWS.
Click Continue Setup.
Complete the on-screen instructions in Fivetran to enter the connection details for your existing Databricks compute resource, specifically the Server Hostname and HTTP Path field values, and the token that you generated earlier.
Click Save & Test.
After the test succeeds, click Continue.
Continue with Next steps.
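If Save & Test fails, a copy-and-paste mistake in one of the three values (Server Hostname, HTTP Path, or the token) is a plausible cause. The following sketch runs a few heuristic checks on those values before you enter them. The `ConnectionDetails` class and `find_problems` function are hypothetical helpers, the checks are assumptions about common mistakes rather than Fivetran's own validation, and the sample values are placeholders.

```python
from dataclasses import dataclass


@dataclass
class ConnectionDetails:
    server_hostname: str
    http_path: str
    token: str


def find_problems(details):
    """Return a list of likely problems; an empty list means nothing obvious is wrong."""
    problems = []

    host = details.server_hostname.strip()
    if not host:
        problems.append("Server Hostname is empty")
    elif "://" in host or "/" in host:
        # Assumption: the field expects a bare host name, not a full URL.
        problems.append("Server Hostname should not include a scheme or path")

    if not details.http_path.strip():
        problems.append("HTTP Path is empty")

    if not details.token.strip():
        problems.append("Token is empty")
    elif any(ch.isspace() for ch in details.token):
        problems.append("Token contains whitespace (possible copy-and-paste error)")

    return problems


# Placeholder values for illustration only.
good = ConnectionDetails("example.cloud.databricks.com", "/sql/1.0/endpoints/abc123", "dapiXXXX")
bad = ConnectionDetails("https://example.cloud.databricks.com", "", "dapi XXXX")

for label, details in (("good", good), ("bad", bad)):
    print(label, find_problems(details))
```

Passing these checks does not guarantee that the connection works. It only rules out a few formatting mistakes before you run the test in Fivetran.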