Fivetran integration

Preview

This feature is in Public Preview.

The Fivetran integration with Databricks helps you easily centralize data from disparate data sources into Delta Lake.

To use Fivetran with Databricks, complete the following steps.

Step 1: Generate a Databricks personal access token

Fivetran authenticates with Databricks using a Databricks personal access token. To generate a personal access token, follow the instructions in Generate a personal access token.
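One way to confirm that a newly generated token works before giving it to Fivetran is to build an authenticated request against the Databricks REST API. The sketch below is illustrative only: the workspace URL and token are placeholders, and the choice of the `clusters/list` endpoint is an assumption made for this example. The code constructs the request but does not send it.

```python
import urllib.request


def build_token_check_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request authenticated with a personal access token."""
    # The token is passed as a Bearer credential in the Authorization header.
    url = workspace_url.rstrip("/") + "/api/2.0/clusters/list"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder values (assumptions); substitute your own workspace URL and token.
request = build_token_check_request("https://example.cloud.databricks.com", "dapiEXAMPLE")
```

Passing `request` to `urllib.request.urlopen` sends it. A successful response suggests the token is valid, while an authentication error suggests it is not.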

Step 2: Set up a cluster to support integration needs

Fivetran writes data to an S3 bucket, and the Databricks integration cluster reads data from that location. The integration cluster therefore requires secure access to the S3 bucket.

Secure access to an S3 bucket

To access AWS resources, you can launch the Databricks integration cluster with an instance profile. The instance profile should have access to the staging S3 bucket and the target S3 bucket where you want to write the Delta tables. To create an instance profile and configure the integration cluster to use the role, follow the instructions in Secure access to S3 buckets using instance profiles.
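As a rough illustration of what "access to the staging S3 bucket and the target S3 bucket" can look like, the sketch below builds an IAM policy document covering both buckets. The bucket names are hypothetical and the list of S3 actions is an assumption; the linked instructions remain the authoritative source for the exact permissions your instance profile needs.

```python
import json


def build_s3_access_policy(staging_bucket: str, target_bucket: str) -> dict:
    """Return an IAM policy document granting read/write access to both buckets."""
    buckets = [staging_bucket, target_bucket]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the buckets themselves.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{b}" for b in buckets],
            },
            {
                # Object-level actions (assumed set) are granted on keys inside the buckets.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{b}/*" for b in buckets],
            },
        ],
    }


# Hypothetical bucket names, used only for illustration.
policy = build_s3_access_policy("my-fivetran-staging", "my-delta-target")
policy_json = json.dumps(policy, indent=2)
```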

As an alternative, you can use IAM credential passthrough, which enables user-specific access to S3 data from a shared cluster.

Specify the cluster configuration

  1. In the Cluster Mode drop-down, select Standard.

  2. In the Databricks Runtime Version drop-down, select Runtime: 6.3 or above.

  3. Turn on Auto Optimize by adding the following properties to your Spark configuration:

    spark.databricks.delta.optimizeWrite.enabled true
    spark.databricks.delta.autoCompact.enabled true
    
  4. Configure your cluster depending on your integration and scaling needs.

For cluster configuration details, see Cluster configurations.
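If you create clusters programmatically rather than through the UI, the settings above could be collected into a cluster specification along the following lines. This is a minimal sketch, not a definitive configuration: the cluster name, runtime version string, node type, and worker count are assumptions chosen for illustration, and the instance profile ARN is a placeholder. Only the two Auto Optimize properties come directly from the steps above.

```python
import json


def build_integration_cluster_spec(instance_profile_arn: str) -> dict:
    """Assemble a cluster specification with Auto Optimize enabled."""
    return {
        "cluster_name": "fivetran-integration",  # hypothetical name
        "spark_version": "6.3.x-scala2.11",      # assumed version string for Runtime 6.3
        "node_type_id": "i3.xlarge",             # assumed node type; size to your workload
        "num_workers": 2,                         # assumed; adjust for scaling needs
        "spark_conf": {
            # Auto Optimize properties from step 3 of the cluster configuration.
            "spark.databricks.delta.optimizeWrite.enabled": "true",
            "spark.databricks.delta.autoCompact.enabled": "true",
        },
        "aws_attributes": {
            # Instance profile that grants access to the staging and target buckets.
            "instance_profile_arn": instance_profile_arn,
        },
    }


# Placeholder ARN (assumption); replace with your own instance profile.
spec = build_integration_cluster_spec(
    "arn:aws:iam::123456789012:instance-profile/fivetran-integration"
)
spec_json = json.dumps(spec, indent=2)
```

The resulting JSON is shaped like a request body for a cluster-creation API call. Check the field names against the API reference for your workspace before relying on them.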

Step 3: Obtain JDBC and ODBC connection details to connect to a cluster

To connect a Databricks cluster to Fivetran, you need the following JDBC/ODBC connection properties:

  • JDBC URL
  • HTTP Path

To obtain the JDBC URL and HTTP Path, see Server hostname, port, HTTP path, and JDBC URL.
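The two properties are related: the HTTP Path typically appears inside the JDBC URL as the `httpPath` parameter. The sketch below parses a JDBC URL to pull out the host, port, and HTTP Path. It assumes the `jdbc:spark://` URL form with semicolon-separated `key=value` parameters; the example URL is a made-up placeholder, so compare it against the value shown for your own cluster.

```python
def parse_jdbc_url(jdbc_url: str) -> dict:
    """Split a jdbc:spark:// URL into host, port, HTTP path, and raw properties."""
    prefix = "jdbc:spark://"
    if not jdbc_url.startswith(prefix):
        raise ValueError("expected a URL starting with 'jdbc:spark://'")

    # Everything before the first ';' is host:port/schema; the rest are parameters.
    location, _, params = jdbc_url[len(prefix):].partition(";")
    host_port = location.split("/", 1)[0]
    host, _, port = host_port.partition(":")

    properties = {}
    for item in params.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            properties[key] = value

    return {
        "host": host,
        "port": int(port) if port else None,
        "http_path": properties.get("httpPath"),
        "properties": properties,
    }


# Placeholder URL (assumption), shaped like the value shown on a cluster's JDBC/ODBC tab.
example_url = (
    "jdbc:spark://example.cloud.databricks.com:443/default;"
    "transportMode=http;ssl=1;"
    "httpPath=sql/protocolv1/o/0/0000-000000-example0;"
    "AuthMech=3;UID=token;PWD=<personal-access-token>"
)
details = parse_jdbc_url(example_url)
```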

Step 4: Configure Fivetran with Databricks

Go to the Fivetran login page and follow the instructions.