Connect to managed ingestion sources
This article describes how to create connections in Catalog Explorer that store authentication details for Lakeflow Connect managed ingestion sources. Any user with USE CONNECTION privileges or ALL PRIVILEGES on the connection can then create managed ingestion pipelines from sources like Salesforce and SQL Server.
An admin user must complete the steps in this article if the users who will create pipelines:
- are non-admin users.
- will use Databricks APIs, Databricks SDKs, the Databricks CLI, or Databricks Asset Bundles.
These interfaces require that users specify an existing connection when they create a pipeline.
Alternatively, admin users can create a connection and a pipeline at the same time in the data ingestion UI. See Connectors in Lakeflow Connect.
Lakeflow Connect vs. Lakehouse Federation
Databricks recommends ingestion using Lakeflow Connect because it scales to accommodate high data volumes, low-latency querying, and third-party API limits. However, you might want to query your data without moving it. Lakehouse Federation allows you to query external data sources without moving your data.
When you have a choice between Lakeflow Connect and Lakehouse Federation, choose Lakehouse Federation for ad hoc reporting or proof-of-concept work on your ETL pipelines. See What is Lakehouse Federation?.
Privilege requirements
The user privileges required to connect to a managed ingestion source depend on the interface you choose:
- Data ingestion UI
  Admin users can create a connection and a pipeline at the same time. This end-to-end ingestion wizard is only available in the UI.
- Catalog Explorer
  Using Catalog Explorer separates connection creation from pipeline creation, so admins can create connections that non-admin users then use to create pipelines.
  If the users who will create pipelines are non-admin users or plan to use Databricks APIs, Databricks SDKs, the Databricks CLI, or Databricks Asset Bundles, an admin must first create the connection in Catalog Explorer. These interfaces require that users specify an existing connection when they create a pipeline.
Scenario | Supported interfaces | Required user privileges |
---|---|---|
An admin user creates a connection and an ingestion pipeline at the same time. | Data ingestion UI | |
An admin user creates a connection for non-admin users to create pipelines with. | Admin: Non-admin: | Admin: Non-admin: |
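After an admin creates a connection, non-admin users need USE CONNECTION on it before they can create pipelines. The following is a minimal sketch of how an admin might build the corresponding `GRANT` statement in Python. The connection name `salesforce_ingest` and the group `data-engineers` are hypothetical, and the helper functions are illustrative rather than part of any Databricks library.

```python
def quote_identifier(name: str) -> str:
    """Wrap a name in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def build_grant_statement(connection_name: str, principal: str) -> str:
    """Return a GRANT statement that gives `principal` USE CONNECTION on the connection."""
    return (
        f"GRANT USE CONNECTION ON CONNECTION {quote_identifier(connection_name)} "
        f"TO {quote_identifier(principal)}"
    )


# Hypothetical connection and group names, for illustration only.
statement = build_grant_statement("salesforce_ingest", "data-engineers")
print(statement)
```

In a Databricks notebook, the resulting string could be passed to `spark.sql(statement)` by a user who is allowed to manage the connection.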
Create a connection for managed ingestion in Catalog Explorer
This section provides instructions to create connections to managed ingestion sources in Catalog Explorer.
Salesforce Sales Cloud
Lakeflow Connect supports ingestion from Salesforce Sales Cloud. It does not support Salesforce Data Cloud, but Lakehouse Federation allows you to query data in Salesforce Data Cloud without moving it. See Run federated queries on Salesforce Data Cloud.
To create a Salesforce ingestion connection in Catalog Explorer, do the following:
- In the Databricks workspace, click Catalog > External locations > Connections > Create connection.
- On the Connection basics page of the Set up connection wizard, specify a unique Connection name.
- In the Connection type drop-down menu, select Salesforce.
- (Optional) Add a comment.
- Click Next.
- If you’re ingesting from a Salesforce sandbox account, set Is sandbox to true.
- Click Sign in with Salesforce. You're redirected to Salesforce.
- If you’re ingesting from a Salesforce sandbox, click Use Custom Domain, provide the sandbox URL, and then click Continue.
- Enter your Salesforce credentials and click Log in. Databricks recommends logging in as a Salesforce user that’s dedicated to Databricks ingestion.
- After returning to the ingestion wizard, click Create connection.
Microsoft SQL Server
To create a Microsoft SQL Server connection in Catalog Explorer, do the following:
- In the Databricks workspace, click Catalog > External Data > Connections.
- Click Create connection.
- Enter a unique Connection name.
- For Connection type, select SQL Server.
- For Host, specify the SQL Server domain name.
- For User and Password, enter your SQL Server login credentials.
- Click Create.
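The same fields can also be collected programmatically, for example when preparing a request to create the connection outside the UI. The sketch below only assembles a request body. It assumes the field names used by the Unity Catalog connections API (`name`, `connection_type`, and an `options` map with `host`, `port`, `user`, and `password`), and it assumes the default SQL Server port 1433, which the steps above do not mention. Verify these against the API reference before use. All values shown are hypothetical.

```python
import json


def build_sqlserver_connection_payload(name, host, user, password, port=1433):
    """Assemble a request body describing a SQL Server connection.

    The field names below are assumptions; verify them before use.
    """
    for label, value in (("name", name), ("host", host), ("user", user), ("password", password)):
        if not value:
            raise ValueError(f"{label} must not be empty")
    return {
        "name": name,
        "connection_type": "SQLSERVER",
        "options": {
            "host": host,
            "port": str(port),  # option values are sent as strings in this sketch
            "user": user,
            "password": password,
        },
    }


# Hypothetical values; in practice, read the password from a secret store.
payload = build_sqlserver_connection_payload(
    name="sqlserver_ingest",
    host="sqlserver.example.com",
    user="ingest_user",
    password="<password>",
)

# Mask the password before printing or logging the payload.
redacted = {**payload, "options": {**payload["options"], "password": "***"}}
print(json.dumps(redacted, indent=2))
```

Keeping payload construction separate from the HTTP call makes the values easy to review and avoids writing credentials to logs.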
Workday Reports
To create a Workday Reports connection in Catalog Explorer, do the following:
- Create Workday access credentials. For instructions, see Configure Workday reports for ingestion.
- In the Databricks workspace, click Catalog > External locations > Connections > Create connection.
- For Connection name, enter a unique name for the Workday connection.
- For Connection type, select Workday Reports.
- For Auth type, select OAuth Refresh Token and then enter the Client ID, Client secret, and Refresh token that you created in step 1.
- On the Create Connection page, click Create.
Next step
After you create a connection to your managed ingestion source in Catalog Explorer, any user with USE CONNECTION privileges or ALL PRIVILEGES on the connection can create an ingestion pipeline in the following ways:
- Ingestion wizard
- Databricks Asset Bundles
- Databricks APIs
- Databricks SDKs
- Databricks CLI
For instructions to create a pipeline, see the managed connector documentation.
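Because the programmatic interfaces require an existing connection, a pipeline definition refers to the connection by name. The sketch below builds such a definition as a plain Python dictionary. It assumes a pipeline specification with an `ingestion_definition` that contains `connection_name` and a list of table `objects`. Treat the field names as assumptions to check against the managed connector documentation. The pipeline, connection, catalog, schema, and table names are hypothetical.

```python
def build_ingestion_pipeline_spec(
    pipeline_name, connection_name, tables, destination_catalog, destination_schema
):
    """Build a pipeline spec that references an existing connection by name.

    `tables` is a list of (source_schema, source_table) pairs.
    The field names are assumptions; verify them before use.
    """
    if not connection_name:
        raise ValueError("an existing connection name is required")
    objects = [
        {
            "table": {
                "source_schema": source_schema,
                "source_table": source_table,
                "destination_catalog": destination_catalog,
                "destination_schema": destination_schema,
            }
        }
        for source_schema, source_table in tables
    ]
    return {
        "name": pipeline_name,
        "ingestion_definition": {
            "connection_name": connection_name,
            "objects": objects,
        },
    }


# Hypothetical names, for illustration only.
spec = build_ingestion_pipeline_spec(
    pipeline_name="sqlserver_orders_ingest",
    connection_name="sqlserver_ingest",
    tables=[("dbo", "orders"), ("dbo", "customers")],
    destination_catalog="main",
    destination_schema="ingest",
)
print(spec["ingestion_definition"]["connection_name"])
```

A dictionary like this could then be serialized for the interface you choose. The exact submission call depends on that interface and is not shown here.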