Connect to managed ingestion sources
This article describes how to create connections in Catalog Explorer that store authentication details for Lakeflow Connect managed ingestion sources. Any user with USE CONNECTION privileges or ALL PRIVILEGES on the connection can then create managed ingestion pipelines from sources like Salesforce and SQL Server.
An admin user must complete the steps in this article if the users who will create pipelines:
- are non-admin users.
- will use Databricks APIs, Databricks SDKs, the Databricks CLI, or Databricks Asset Bundles.
These interfaces require that users specify an existing connection when they create a pipeline.
Alternatively, admin users can create a connection and a pipeline at the same time in the data ingestion UI. See Managed connectors in Lakeflow Connect.
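The rule above can be sketched as a small decision helper. This is only an illustration of the logic described in this article; the function name and the interface labels ("ui", "api", "sdk", "cli", "asset_bundles") are hypothetical and are not part of any Databricks API.

```python
# Interfaces that, per this article, require an existing connection
# when a user creates a pipeline. The labels are hypothetical.
INTERFACES_REQUIRING_EXISTING_CONNECTION = {"api", "sdk", "cli", "asset_bundles"}


def needs_catalog_explorer_connection(is_admin: bool, interface: str) -> bool:
    """Return True if an admin must first create the connection in Catalog Explorer."""
    # APIs, SDKs, the CLI, and Asset Bundles always need an existing connection.
    if interface.lower() in INTERFACES_REQUIRING_EXISTING_CONNECTION:
        return True
    # In the data ingestion UI, only admin users can create the connection
    # and the pipeline at the same time.
    return not is_admin


if __name__ == "__main__":
    print(needs_catalog_explorer_connection(is_admin=True, interface="ui"))
    print(needs_catalog_explorer_connection(is_admin=False, interface="ui"))
    print(needs_catalog_explorer_connection(is_admin=True, interface="cli"))
```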
Lakeflow Connect vs. Lakehouse Federation
Lakehouse Federation allows you to query external data sources without moving your data. When you have a choice between Lakeflow Connect and Lakehouse Federation, choose Lakehouse Federation for ad hoc reporting or proof-of-concept work on your ETL pipelines. See What is Lakehouse Federation?.
Privilege requirements
The user privileges required to connect to a managed ingestion source depend on the interface you choose:
- Data ingestion UI
  Admin users can create a connection and a pipeline at the same time. This end-to-end ingestion wizard is only available in the UI. Not all managed ingestion connectors support UI-based pipeline authoring.
- Catalog Explorer
  Using Catalog Explorer separates connection creation from pipeline creation, so admins can create connections that non-admin users then use to create pipelines.
  If the users who will create pipelines are non-admin users or plan to use Databricks APIs, Databricks SDKs, the Databricks CLI, or Databricks Asset Bundles, an admin must first create the connection in Catalog Explorer. These interfaces require that users specify an existing connection when they create a pipeline.
Scenario | Supported interfaces | Required user privileges |
---|---|---|
An admin user creates a connection and an ingestion pipeline at the same time. | Data ingestion UI | |
An admin user creates a connection for non-admin users to create pipelines with. | Admin: Non-admin: | Admin: Non-admin: |
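After an admin creates a connection, a non-admin user needs USE CONNECTION on it before they can create pipelines. One way to grant that privilege is a SQL GRANT statement. The following is a minimal sketch that only builds the statement text, assuming the Unity Catalog form GRANT USE CONNECTION ON CONNECTION ... TO ...; the helper names, the connection name, and the principal are hypothetical.

```python
def quote_identifier(name: str) -> str:
    """Wrap a name in backticks, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def grant_use_connection_sql(connection_name: str, principal: str) -> str:
    """Build a GRANT statement giving a principal USE CONNECTION on a connection.

    Assumption: the statement follows the Unity Catalog GRANT syntax.
    This function only returns text; it does not run anything.
    """
    return (
        "GRANT USE CONNECTION ON CONNECTION "
        f"{quote_identifier(connection_name)} TO {quote_identifier(principal)}"
    )


if __name__ == "__main__":
    # Hypothetical connection and group names.
    print(grant_use_connection_sql("salesforce_prod", "ingest-team"))
```

An admin could paste the printed statement into a SQL editor or notebook. Verify the syntax against the GRANT reference for your workspace before relying on it.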
Google Analytics Raw Data
To create a Google Analytics Raw Data connection in Catalog Explorer, do the following:
- In the Databricks workspace, click Catalog > External locations > Connections > Create connection.
- On the Connection basics page of the Set up connection wizard, specify a unique Connection name.
- In the Connection type drop-down menu, select Google Analytics Raw Data.
- (Optional) Add a comment.
- Click Next.
- In the service_account_json field, paste the service account JSON details that you downloaded from BigQuery in the source setup.
- Click Create connection.
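Before you paste the service account JSON into the wizard, it can help to confirm that the file is intact. The following is a minimal sketch of such a check, assuming the standard Google Cloud service account key layout (type, project_id, private_key, client_email). The function name is hypothetical, and the check does not verify that the credentials actually work.

```python
import json

# Fields that a Google Cloud service account key file normally contains.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_json(raw: str) -> list[str]:
    """Return a list of problems found; an empty list means the basic checks passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    # Report every required field that is absent or empty.
    problems = [f"missing field: {field}" for field in REQUIRED_FIELDS if not data.get(field)]
    # A present but unexpected type usually means the wrong kind of key file.
    if data.get("type") not in (None, "", "service_account"):
        problems.append("type is not 'service_account'")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    sample = json.dumps({
        "type": "service_account",
        "project_id": "my-project",
        "private_key": "PLACEHOLDER",
        "client_email": "ingest@my-project.iam.gserviceaccount.com",
    })
    print(check_service_account_json(sample))
```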
Salesforce
Lakeflow Connect supports ingesting data from the Salesforce Platform. Databricks also offers a zero-copy connector in Lakehouse Federation to run federated queries on Salesforce Data Cloud.
To create a Salesforce ingestion connection in Catalog Explorer, do the following:
- In the Databricks workspace, click Catalog > External locations > Connections > Create connection.
- On the Connection basics page of the Set up connection wizard, specify a unique Connection name.
- In the Connection type drop-down menu, select Salesforce.
- (Optional) Add a comment.
- Click Next.
- If you're ingesting from a Salesforce sandbox account, set Is sandbox to true.
- Click Sign in with Salesforce.
  You're redirected to Salesforce.
- If you're ingesting from a Salesforce sandbox, click Use Custom Domain, provide the sandbox URL, and then click Continue.
- Enter your Salesforce credentials and click Log in. Databricks recommends logging in as a Salesforce user that's dedicated to Databricks ingestion.
  Important: For security purposes, only authenticate if you clicked an OAuth 2.0 link in the Databricks UI.
- After returning to the ingestion wizard, click Create connection.
ServiceNow
- Configure OAuth. For instructions, see Configure ServiceNow for Databricks ingestion.
- In the Databricks workspace, click Catalog > External locations > Connections > Create connection.
- On the Connection basics page of the Set up connection wizard, specify a unique Connection name.
- In the Connection type drop-down menu, select ServiceNow.
- (Optional) Add a comment.
- Click Next.
- On the Authentication page, enter the following:
  - Instance ID: ServiceNow instance ID.
  - OAuth scope: Leave the default value useraccount.
  - Client secret: The client secret that you obtained in the source setup.
  - Client ID: The client ID that you obtained in the source setup.
- Click Sign in with ServiceNow.
- Sign in using your ServiceNow credentials.
  You're redirected to the Databricks workspace.
- Click Create connection.
SharePoint
The steps to create a SharePoint connection in Catalog Explorer depend on the OAuth method you choose. The following methods are supported:
- User-to-machine (U2M) authentication
- Manual token refresh authentication
Databricks recommends U2M because the refresh token is handled for you automatically, so you don't need to compute it yourself. U2M also simplifies granting the Entra ID client access to your SharePoint files and is more secure.
U2M (recommended)
- Complete the source setup. You'll use the authentication details that you obtain to create the connection.
- In the Databricks workspace, click Catalog > External data > Connections > Create connection.
- On the Connection basics page of the Set up connection wizard, specify a unique Connection name.
- In the Connection type drop-down menu, select Microsoft SharePoint.
- In the Auth type drop-down menu, select OAuth.
- (Optional) Add a comment.
- Click Next.
- On the Authentication page, enter the following credentials for your Microsoft Entra ID app:
  - OAuth scope: Leave the OAuth scope set to the pre-filled value.
  - Client secret: The client secret that you retrieved in the source setup.
  - Client ID: The client ID that you retrieved in the source setup.
  - Domain: The SharePoint instance URL in the following format: https://MYINSTANCE.sharepoint.com
  - Tenant ID: The tenant ID that you retrieved in the source setup.
- Click Sign in with Microsoft SharePoint.
  A new window opens. After you sign in with your SharePoint credentials, the permissions you’re granting to the Entra ID app are shown.
- Click Accept.
  A Successfully authorized message displays, and you’re redirected to the Databricks workspace.
- Click Create connection.
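The Domain value in the steps above must follow the format https://MYINSTANCE.sharepoint.com. The following sketch shows one way to check a value before entering it. The function name is hypothetical, and the strictness (HTTPS only, no path, query, or fragment) is an assumption based on the format shown, not a documented validation rule.

```python
from urllib.parse import urlsplit

SHAREPOINT_SUFFIX = ".sharepoint.com"


def is_valid_sharepoint_domain(value: str) -> bool:
    """Check that a value looks like https://MYINSTANCE.sharepoint.com."""
    parts = urlsplit(value.strip())
    host = parts.hostname or ""
    return (
        parts.scheme == "https"
        # The host must have an instance name in front of the suffix.
        and host.endswith(SHAREPOINT_SUFFIX)
        and len(host) > len(SHAREPOINT_SUFFIX)
        # Assumption: the field expects the bare instance URL only.
        and parts.path in ("", "/")
        and not parts.query
        and not parts.fragment
    )


if __name__ == "__main__":
    # "contoso" is a placeholder instance name.
    for candidate in (
        "https://contoso.sharepoint.com",
        "http://contoso.sharepoint.com",
        "https://contoso.sharepoint.com/sites/hr",
        "contoso.sharepoint.com",
    ):
        print(candidate, is_valid_sharepoint_domain(candidate))
```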
Manual refresh token
- Complete the source setup. You'll use the authentication details that you obtain to create the connection.
- In the Databricks workspace, click Catalog > External data > Connections > Create connection.
- On the Connection basics page of the Set up connection wizard, specify a unique Connection name.
- In the Connection type drop-down menu, select Microsoft SharePoint.
- In the Auth type drop-down menu, select OAuth Refresh Token.
- (Optional) Add a comment.
- Click Next.
- On the Authentication page, enter the following credentials for your Microsoft Entra ID app:
  - Tenant ID: The tenant ID that you retrieved in the source setup.
  - Client ID: The client ID that you retrieved in the source setup.
  - Client secret: The client secret that you retrieved in the source setup.
  - Refresh token: The refresh token that you retrieved in the source setup.
- Click Create connection.
SQL Server
To create a Microsoft SQL Server connection in Catalog Explorer, do the following:
- In the Databricks workspace, click Catalog > External Data > Connections.
- Click Create connection.
- Enter a unique Connection name.
- For Connection type, select SQL Server.
- For Host, specify the SQL Server domain name.
- For User and Password, enter your SQL Server login credentials.
- Click Create.
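For reference, the same fields can be collected into a structure like the one a scripted workflow might send when creating a connection. The following is a rough sketch only. It assumes a request body with name, connection_type, and options keys, and option names (host, user, password) that mirror the UI fields above. These names, the SQLSERVER type string, and the function itself are assumptions to verify against the Unity Catalog connections API reference.

```python
import json


def build_sqlserver_connection_payload(
    name: str, host: str, user: str, password: str, comment: str = ""
) -> dict:
    """Assemble a hypothetical request body for a SQL Server connection."""
    # The Host field expects a domain name, not a URL.
    if "://" in host or "/" in host:
        raise ValueError("host should be a domain name, not a URL")
    payload = {
        "name": name,
        "connection_type": "SQLSERVER",  # assumed type identifier
        "options": {"host": host, "user": user, "password": password},
    }
    if comment:
        payload["comment"] = comment
    return payload


if __name__ == "__main__":
    # Placeholder values; read real credentials from a secret store, not source code.
    body = build_sqlserver_connection_payload(
        name="sqlserver_sales",
        host="sales-db.example.com",
        user="ingest_user",
        password="REPLACE_ME",
    )
    # Redact the password before printing or logging the payload.
    redacted = {**body, "options": {**body["options"], "password": "***"}}
    print(json.dumps(redacted, indent=2))
```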
Workday Reports
To create a Workday Reports connection in Catalog Explorer, do the following:
- Create Workday access credentials. For instructions, see Configure Workday reports for ingestion.
- In the Databricks workspace, click Catalog > External locations > Connections > Create connection.
- For Connection name, enter a unique name for the Workday connection.
- For Connection type, select Workday Reports.
- For Auth type, select OAuth Refresh Token.
- Enter the Client ID, Client secret, and Refresh token that you obtained in the source setup.
- On the Create Connection page, click Create.
Next step
After you create a connection to your managed ingestion source in Catalog Explorer, any user with USE CONNECTION privileges or ALL PRIVILEGES on the connection can create an ingestion pipeline in the following ways:
- Ingestion wizard (supported connectors only)
- Databricks Asset Bundles
- Databricks APIs
- Databricks SDKs
- Databricks CLI
For instructions to create a pipeline, see the managed connector documentation.
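To illustrate how a pipeline refers to an existing connection by name, the following sketch assembles a pipeline specification as a Python dictionary. The field names (ingestion_definition, connection_name, objects, and the table keys) are assumptions modeled on the managed connector examples, and all values are placeholders. Treat the managed connector documentation for your source as the authority on the actual specification.

```python
import json


def build_ingestion_pipeline_spec(
    pipeline_name: str,
    connection_name: str,
    source_schema: str,
    source_table: str,
    destination_catalog: str,
    destination_schema: str,
) -> dict:
    """Assemble a hypothetical ingestion pipeline spec that uses an existing connection."""
    return {
        "name": pipeline_name,
        "ingestion_definition": {
            # The connection must already exist, and the user creating the
            # pipeline needs USE CONNECTION or ALL PRIVILEGES on it.
            "connection_name": connection_name,
            "objects": [
                {
                    "table": {
                        "source_schema": source_schema,
                        "source_table": source_table,
                        "destination_catalog": destination_catalog,
                        "destination_schema": destination_schema,
                    }
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder names for illustration only.
    spec = build_ingestion_pipeline_spec(
        pipeline_name="sales_ingest",
        connection_name="sqlserver_sales",
        source_schema="dbo",
        source_table="orders",
        destination_catalog="main",
        destination_schema="ingest",
    )
    print(json.dumps(spec, indent=2))
```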