Fivetran automated data integration adapts as schemas and APIs change, ensuring reliable data access and simplified analysis with ready-to-query schemas.
You can integrate your Databricks SQL warehouses (formerly Databricks SQL endpoints) and Databricks clusters with Fivetran. The Fivetran integration with Databricks helps you centralize data from disparate data sources into Delta Lake.
To connect your Databricks workspace to Fivetran using Partner Connect, see Connect to a data ingestion partner using Partner Connect.
Partner Connect does not integrate Fivetran with Databricks clusters. To integrate a cluster with Fivetran, connect to Fivetran manually.
For an overview of the Partner Connect procedure, watch this YouTube video (3 minutes).
For an overview of the manual connection procedure, watch this YouTube video (2 minutes).
For a faster way to connect a SQL warehouse to Fivetran, use Partner Connect.
Before you connect to Fivetran manually, you must have the following:
A cluster or SQL warehouse in your Databricks workspace.
The connection details for your cluster or SQL warehouse, specifically the Server Hostname, Port, and HTTP Path values.
A Databricks personal access token.
As a security best practice, when authenticating with automated tools, systems, scripts, and apps, Databricks recommends you use access tokens belonging to service principals instead of workspace users. To create access tokens for service principals, see Manage access tokens for a service principal.
If the Fivetran tile in Partner Connect in your workspace has a check mark icon inside it, you can get the connection details for the connected SQL warehouse by clicking the tile and then expanding Connection details. The personal access token is hidden, so you must create a replacement personal access token and enter that new token when Fivetran asks you for it.
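Before entering the connection details in Fivetran, it can help to sanity-check them locally. The following is a minimal sketch in Python, not part of Fivetran or Databricks tooling. The names ConnectionDetails, check_connection_details, and compute_kind are hypothetical, the default port of 443 is an assumption, and the HTTP Path patterns are assumptions about how these paths commonly look for SQL warehouses and clusters. Compare them with the values shown in your own workspace.

```python
import re
from dataclasses import dataclass

# Assumed HTTP Path shapes (illustrative only; verify against your workspace):
#   SQL warehouse: /sql/1.0/warehouses/<id>  (older form: /sql/1.0/endpoints/<id>)
#   Cluster:       sql/protocolv1/o/<workspace-id>/<cluster-id>
_WAREHOUSE_PATH = re.compile(r"^/?sql/1\.0/(warehouses|endpoints)/[0-9A-Za-z]+$")
_CLUSTER_PATH = re.compile(r"^/?sql/protocolv1/o/\d+/[\w-]+$")


@dataclass(frozen=True)
class ConnectionDetails:
    """Hypothetical container for the values listed in the requirements."""
    server_hostname: str
    http_path: str
    port: int = 443  # assumption: HTTPS default


def compute_kind(http_path: str) -> str:
    """Guess which kind of compute resource an HTTP Path points to."""
    if _WAREHOUSE_PATH.match(http_path):
        return "sql_warehouse"
    if _CLUSTER_PATH.match(http_path):
        return "cluster"
    return "unknown"


def check_connection_details(details: ConnectionDetails) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    host = details.server_hostname.strip()
    if not host:
        problems.append("Server Hostname is empty")
    elif "://" in host or "/" in host:
        # Fivetran asks for a host name, not a full URL.
        problems.append("Server Hostname should not include a scheme or path")
    if not 1 <= details.port <= 65535:
        problems.append("Port must be between 1 and 65535")
    if compute_kind(details.http_path) == "unknown":
        problems.append("HTTP Path does not look like a warehouse or cluster path")
    return problems


if __name__ == "__main__":
    # Placeholder values, not real connection details.
    details = ConnectionDetails(
        server_hostname="dbc-a1b2345c-d6e7.cloud.databricks.com",
        http_path="/sql/1.0/warehouses/a1b2c3d4e5f6a7b8",
    )
    print(compute_kind(details.http_path), check_connection_details(details))
```

This check only inspects the format of the values. It does not contact Databricks, so it cannot confirm that the compute resource exists or that the personal access token is valid; the Save & Test step in Fivetran remains the actual test.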
To connect to Fivetran manually, do the following:
Sign in to your Fivetran account, or create a new Fivetran account, at https://fivetran.com/dashboard.
If you sign in to your organization’s Fivetran account, a Choose Destination page may display, listing one or more existing destination entries with the Databricks logo. These entries might contain connection details for compute resources in workspaces that are separate from yours. If you still want to reuse one of these connections, and you trust the compute resource and have access to it, choose that destination and then skip ahead to Next steps. Otherwise, choose any available destination to get past this page, and then go to https://fivetran.com/account.
On the Dashboard page in Fivetran, click the Destinations tab. (If the Dashboard page is not displayed, go to https://fivetran.com/account.)
Click Add Destination.
Enter a Destination name and click Add.
On the Fivetran is modern ELT page, click Set up a connector.
Click a data source, and then click Next.
Follow the on-screen instructions in the Setup Guide in Fivetran to finish setting up the connector.
Click Save & Test.
After the test succeeds, click Continue.
On the Select your data’s destination page, click Databricks on AWS.
Click Continue Setup.
Complete the on-screen instructions in Fivetran to enter the connection details for your existing Databricks compute resource, specifically the Server Hostname and HTTP Path field values, and the token that you generated earlier.
Click Save & Test.
After the test succeeds, click Continue.
Continue with Next steps.