Connect to managed ingestion sources

This article describes how to create connections in Catalog Explorer that store authentication details for Lakeflow Connect managed ingestion sources. Any user with USE CONNECTION privileges or ALL PRIVILEGES on the connection can then create managed ingestion pipelines from sources like Salesforce and SQL Server.

An admin user must complete the steps in this article if the users who will create pipelines:

  • are non-admin users.
  • will use Databricks APIs, Databricks SDKs, the Databricks CLI, or Databricks Asset Bundles.

These interfaces require that users specify an existing connection when they create a pipeline.

Alternatively, admin users can create a connection and a pipeline at the same time in the data ingestion UI. See Managed connectors in Lakeflow Connect.
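
When using the programmatic interfaces, a pipeline author can first check which connections already exist and reference one by name. The following is a minimal sketch with the Databricks SDK for Python, assuming the databricks-sdk package is installed and default Databricks authentication is configured:

    # Minimal sketch: list existing Unity Catalog connections with the
    # Databricks SDK for Python, so a pipeline definition can reference
    # one by name.
    from databricks.sdk import WorkspaceClient

    # Uses standard Databricks authentication (environment variables,
    # a configuration profile, or notebook context).
    w = WorkspaceClient()

    for conn in w.connections.list():
        print(conn.name, conn.connection_type)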

Lakeflow Connect vs. Lakehouse Federation

Lakehouse Federation allows you to query external data sources without moving your data. When you have a choice between Lakeflow Connect and Lakehouse Federation, choose Lakehouse Federation for ad hoc reporting or proof-of-concept work on your ETL pipelines. See What is Lakehouse Federation?.

Privilege requirements

The user privileges required to connect to a managed ingestion source depend on the interface you choose:

  • Data ingestion UI

    Admin users can create a connection and a pipeline at the same time. This end-to-end ingestion wizard is only available in the UI. Not all managed ingestion connectors support UI-based pipeline authoring.

  • Catalog Explorer

    Using Catalog Explorer separates connection creation from pipeline creation, which allows an admin to create connections that non-admin users can then use to create pipelines.

    If the users who will create pipelines are non-admin users or plan to use Databricks APIs, Databricks SDKs, the Databricks CLI, or Databricks Asset Bundles, an admin must first create the connection in Catalog Explorer. These interfaces require that users specify an existing connection when they create a pipeline.

The following summarizes the supported interfaces and the required user privileges for each scenario.

Scenario 1: An admin user creates a connection and an ingestion pipeline at the same time.

Supported interfaces:

  • Data ingestion UI

Required user privileges:

  • CREATE CONNECTION on the metastore
  • USE CATALOG on the target catalog
  • (SaaS apps) USE SCHEMA and CREATE TABLE on an existing schema, or CREATE SCHEMA on the target catalog
  • (Databases) USE SCHEMA, CREATE TABLE, and CREATE VOLUME on an existing schema, or CREATE SCHEMA on the target catalog

Scenario 2: An admin user creates a connection for non-admin users to create pipelines with.

Supported interfaces:

  • Admin: Catalog Explorer
  • Non-admin: Data ingestion UI, Databricks APIs, Databricks SDKs, the Databricks CLI, or Databricks Asset Bundles

Required user privileges:

  • Admin: CREATE CONNECTION on the metastore
  • Non-admin:
    • USE CONNECTION or ALL PRIVILEGES on an existing connection
    • USE CATALOG on the target catalog
    • (SaaS apps) USE SCHEMA and CREATE TABLE on an existing schema, or CREATE SCHEMA on the target catalog
    • (Databases) USE SCHEMA, CREATE TABLE, and CREATE VOLUME on an existing schema, or CREATE SCHEMA on the target catalog
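
After creating a connection, an admin can grant USE CONNECTION to the users or groups who will create pipelines. As a rough illustration, this Databricks SDK for Python sketch grants the privilege on a hypothetical connection to a hypothetical group; both names are placeholders.

    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service.catalog import (
        PermissionsChange,
        Privilege,
        SecurableType,
    )

    w = WorkspaceClient()

    # Grant USE CONNECTION on an existing connection so group members
    # can create ingestion pipelines with it.
    w.grants.update(
        securable_type=SecurableType.CONNECTION,
        full_name="my_ingestion_connection",  # placeholder connection name
        changes=[
            PermissionsChange(
                principal="data-engineers",  # placeholder group name
                add=[Privilege.USE_CONNECTION],
            )
        ],
    )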

Google Analytics Raw Data

To create a Google Analytics Raw Data connection in Catalog Explorer, do the following:

  1. In the Databricks workspace, click Catalog > External Data > Connections > Create connection.
  2. On the Connection basics page of the Set up connection wizard, specify a unique Connection name.
  3. In the Connection type drop-down menu, select Google Analytics Raw Data.
  4. (Optional) Add a comment.
  5. Click Next.
  6. In the service_account_json field, paste the service account JSON details that you downloaded from BigQuery in the source setup.
  7. Click Create connection.
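
Before pasting the service account JSON, it can help to confirm that the file parses and contains the standard Google service account fields. A minimal sketch; the file path is a placeholder.

    import json

    # Sanity-check a downloaded Google service account key file before
    # pasting its contents into the service_account_json field.
    with open("service-account.json") as f:  # placeholder path
        key = json.load(f)  # fails loudly if the file isn't valid JSON

    # Fields present in every Google service account key.
    for field in ("type", "project_id", "private_key", "client_email"):
        assert field in key, f"missing field: {field}"

    print("Key file looks well formed for", key["client_email"])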

Salesforce

Lakeflow Connect supports ingesting data from the Salesforce Platform. Databricks also offers a zero-copy connector in Lakehouse Federation to run federated queries on Salesforce Data Cloud.

To create a Salesforce ingestion connection in Catalog Explorer, do the following:

  1. In the Databricks workspace, click Catalog > External Data > Connections > Create connection.

  2. On the Connection basics page of the Set up connection wizard, specify a unique Connection name.

  3. In the Connection type drop-down menu, select Salesforce.

  4. (Optional) Add a comment.

  5. Click Next.

  6. If you're ingesting from a Salesforce sandbox account, set Is sandbox to true.

  7. Click Sign in with Salesforce.

    You're redirected to Salesforce.

  8. If you're ingesting from a Salesforce sandbox, click Use Custom Domain, provide the sandbox URL, and then click Continue.

  9. Enter your Salesforce credentials and click Log in. Databricks recommends logging in as a Salesforce user that's dedicated to Databricks ingestion.

    Important: For security purposes, only authenticate if you clicked an OAuth 2.0 link in the Databricks UI.

  10. After returning to the ingestion wizard, click Create connection.

ServiceNow

To create a ServiceNow connection in Catalog Explorer, do the following:

  1. Configure OAuth. For instructions, see Configure ServiceNow for Databricks ingestion.

  2. In the Databricks workspace, click Catalog > External Data > Connections > Create connection.

  3. On the Connection basics page of the Set up connection wizard, specify a unique Connection name.

  4. In the Connection type drop-down menu, select ServiceNow.

  5. (Optional) Add a comment.

  6. Click Next.

  7. On the Authentication page, enter the following:

    • Instance ID: Your ServiceNow instance ID.
    • OAuth scope: Leave the default value useraccount.
    • Client secret: The client secret that you obtained in the source setup.
    • Client ID: The client ID that you obtained in the source setup.
  8. Click Sign in with ServiceNow.

  9. Sign in using your ServiceNow credentials.

    You're redirected to the Databricks workspace.

  10. Click Create connection.

SharePoint

The steps to create a SharePoint connection in Catalog Explorer depend on the OAuth method you choose. The following methods are supported:

  • User-to-machine (U2M) authentication
  • Manual token refresh authentication

Databricks recommends U2M authentication because it computes and refreshes the token for you automatically, simplifies granting the Entra ID client access to your SharePoint files, and is more secure.

User-to-machine (U2M) authentication

  1. Complete the source setup. You'll use the authentication details that you obtain to create the connection.

  2. In the Databricks workspace, click Catalog > External Data > Connections > Create connection.

  3. On the Connection basics page of the Set up connection wizard, specify a unique Connection name.

  4. In the Connection type drop-down menu, select Microsoft SharePoint.

  5. In the Auth type drop-down menu, select OAuth.

  6. (Optional) Add a comment.

  7. Click Next.

  8. On the Authentication page, enter the following credentials for your Microsoft Entra ID app:

    • OAuth scope: Leave the OAuth scope set to the pre-filled value.
    • Client secret: The client secret that you retrieved in the source setup.
    • Client ID: The client ID that you retrieved in the source setup.
    • Domain: The SharePoint instance URL in the following format: https://MYINSTANCE.sharepoint.com
    • Tenant ID: The tenant ID that you retrieved in the source setup.

  9. Click Sign in with Microsoft SharePoint.

    A new window opens. After you sign in with your SharePoint credentials, the permissions you’re granting to the Entra ID app are shown.

  10. Click Accept.

    A Successfully authorized message appears, and you're redirected to the Databricks workspace.

  11. Click Create connection.

Manual token refresh authentication

  1. Complete the source setup. You'll use the authentication details that you obtain to create the connection.

  2. In the Databricks workspace, click Catalog > External Data > Connections > Create connection.

  3. On the Connection basics page of the Set up connection wizard, specify a unique Connection name.

  4. In the Connection type drop-down menu, select Microsoft SharePoint.

  5. In the Auth type drop-down menu, select OAuth Refresh Token.

  6. (Optional) Add a comment.

  7. Click Next.

  8. On the Authentication page, enter the following credentials for your Microsoft Entra ID app:

    • Tenant ID: The tenant ID that you retrieved in the source setup.
    • Client ID: The client ID that you retrieved in the source setup.
    • Client secret: The client secret that you retrieved in the source setup.
    • Refresh token: The refresh token that you retrieved in the source setup.

  9. Click Create connection.
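
The source setup covers how to obtain the refresh token. For orientation only, the sketch below shows the general shape of exchanging an authorization code for a refresh token at the Microsoft Entra ID v2.0 token endpoint; the identifiers, authorization code, redirect URI, and scopes are all placeholders that must come from your own app registration and source setup.

    import requests

    TENANT_ID = "<tenant-id>"          # placeholders from your Entra ID app
    CLIENT_ID = "<client-id>"
    CLIENT_SECRET = "<client-secret>"

    # Standard Microsoft Entra ID v2.0 token endpoint.
    token_url = f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token"

    # Exchange an authorization code (from the browser consent step) for
    # tokens. Requesting offline_access is what yields a refresh token.
    resp = requests.post(
        token_url,
        data={
            "grant_type": "authorization_code",
            "client_id": CLIENT_ID,
            "client_secret": CLIENT_SECRET,
            "code": "<authorization-code>",           # placeholder
            "redirect_uri": "http://localhost:8000",  # must match the app registration
            "scope": "offline_access <your-scopes>",  # placeholder scopes
        },
    )
    resp.raise_for_status()
    print(resp.json()["refresh_token"])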

SQL Server

To create a Microsoft SQL Server connection in Catalog Explorer, do the following:

  1. In the Databricks workspace, click Catalog > External Data > Connections.
  2. Click Create connection.
  3. Enter a unique Connection name.
  4. For Connection type, select SQL Server.
  5. For Host, specify the SQL Server domain name.
  6. For User and Password, enter your SQL Server login credentials.
  7. Click Create.
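
If you prefer to script this step, the connection can also be created with the Databricks SDK for Python. The following is a minimal sketch, assuming the host, port, user, and password option keys; all values are placeholders, and storing the password in a secret is preferable to hard-coding it.

    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service.catalog import ConnectionType

    w = WorkspaceClient()

    # Create a Unity Catalog connection for SQL Server. Option keys are
    # assumed; all values below are placeholders.
    w.connections.create(
        name="my_sqlserver_connection",
        connection_type=ConnectionType.SQLSERVER,
        options={
            "host": "myserver.example.com",
            "port": "1433",
            "user": "ingestion_user",
            "password": "<password>",  # prefer a secret over a literal
        },
    )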

Workday Reports

To create a Workday Reports connection in Catalog Explorer, do the following:

  1. Create Workday access credentials. For instructions, see Configure Workday reports for ingestion.
  2. In the Databricks workspace, click Catalog > External Data > Connections > Create connection.
  3. For Connection name, enter a unique name for the Workday connection.
  4. For Connection type, select Workday Reports.
  5. For Auth type, select OAuth Refresh Token.
  6. Enter the Client ID, Client secret, and Refresh token that you obtained in the source setup.
  7. On the Create Connection page, click Create.

Next step

After you create a connection to your managed ingestion source in Catalog Explorer, any user with USE CONNECTION privileges or ALL PRIVILEGES on the connection can create an ingestion pipeline in the following ways:

  • Ingestion wizard (supported connectors only)
  • Databricks Asset Bundles
  • Databricks APIs
  • Databricks SDKs
  • Databricks CLI

For instructions to create a pipeline, see the managed connector documentation.
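
As a rough illustration of the programmatic path, the sketch below uses the Databricks SDK for Python to create a managed ingestion pipeline that references an existing connection. The connection, catalog, schema, and table names are placeholders, and the exact object specification varies by connector, so treat the connector documentation as authoritative.

    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service.pipelines import (
        IngestionConfig,
        IngestionPipelineDefinition,
        TableSpec,
    )

    w = WorkspaceClient()

    # Create an ingestion pipeline that reads one source table through
    # an existing Unity Catalog connection. All names are placeholders.
    pipeline = w.pipelines.create(
        name="my-ingestion-pipeline",
        ingestion_definition=IngestionPipelineDefinition(
            connection_name="my_ingestion_connection",
            objects=[
                IngestionConfig(
                    table=TableSpec(
                        source_schema="objects",        # placeholder (Salesforce-style)
                        source_table="Account",         # placeholder
                        destination_catalog="main",     # placeholder
                        destination_schema="ingested",  # placeholder
                    )
                )
            ],
        ),
    )
    print(pipeline.pipeline_id)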