Managed connector FAQs
This article answers frequently asked questions about managed connectors in Databricks Lakeflow Connect. For connector-specific FAQs, see the documentation for your connector.
Which managed connectors does Databricks support?
Lakeflow Connect offers managed connectors for Salesforce, Workday, SQL Server, ServiceNow, and Google Analytics 4 (GA4). To inform the roadmap or gain early access to Private Preview connectors, reach out to your account team.
Which interfaces do managed connectors support?
All managed connectors support pipeline creation using the Databricks APIs and Databricks Asset Bundles (DABs). Most connectors also offer pipeline creation in the UI.
The following table summarizes which interfaces are supported by each connector.
Connector | UI-based pipeline authoring | API-based pipeline authoring | DABs |
---|---|---|---|
Salesforce | ✓ | ✓ | ✓ |
Workday | | ✓ | ✓ |
SQL Server | ✓ | ✓ | ✓ |
ServiceNow | ✓ | ✓ | ✓ |
GA4 | ✓ | ✓ | ✓ |
How do managed connectors handle schema evolution?
All managed connectors automatically handle new and deleted columns unless you opt out by explicitly specifying the columns that you'd like to ingest.
- When a new column appears in the source, Databricks automatically ingests it on the next run of your pipeline. For rows that were ingested before the schema change, Databricks leaves the new column empty. You can opt out of automatic column ingestion by listing the specific columns to ingest via the API (see the sketch after this list) or by disabling future columns in the UI.
- When a column is deleted from the source, Databricks doesn't delete it automatically. Instead, the connector uses a table property to mark the deleted column as inactive in the destination. If another column with the same name later appears, the pipeline fails. In that case, you can trigger a full refresh of the table or manually drop the inactive column.
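For illustration, here's a minimal sketch of creating an ingestion pipeline that opts out of automatic column ingestion by enumerating columns. It assumes the `/api/2.0/pipelines` REST endpoint and an `include_columns` option under `table_configuration`; the workspace URL, token, connection, and object names are placeholders.

```python
import requests

# Placeholders; substitute your workspace URL and a valid token.
HOST = "https://<workspace-host>"
TOKEN = "<personal-access-token>"

# Sketch of an ingestion pipeline spec. Listing columns under
# `include_columns` (assumed option) opts this table out of automatic
# ingestion of future columns: only the named columns are ingested.
payload = {
    "name": "salesforce-accounts",
    "ingestion_definition": {
        "connection_name": "salesforce_connection",  # hypothetical connection
        "objects": [
            {
                "table": {
                    "source_table": "Account",
                    "destination_catalog": "main",
                    "destination_schema": "sales",
                    "table_configuration": {
                        "include_columns": ["Id", "Name", "Industry"]
                    },
                }
            }
        ],
    },
}

resp = requests.post(
    f"{HOST}/api/2.0/pipelines",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
)
resp.raise_for_status()
print(resp.json()["pipeline_id"])
```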
Similarly, connectors can handle new and deleted tables. If you ingest an entire schema, Databricks automatically ingests any new tables unless you opt out. If a table is deleted in the source, the connector marks it as inactive in the destination. If you do choose to ingest an entire schema, review the limitations on the number of tables per pipeline for your connector.
Other schema changes depend on the source. For example, the Salesforce connector treats a column rename as a column deletion plus a column addition and applies the change automatically, with the behavior outlined above. The SQL Server connector, however, requires a full refresh of the affected tables before ingestion can continue.
The following table summarizes which schema changes can be handled automatically by each connector:
Connector | New and deleted columns | Data type changes | Column renames | New tables |
---|---|---|---|---|
Salesforce | ✓ | | ✓ | ✓ |
Workday | ✓ | Not applicable | | |
SQL Server | ✓ | | | ✓ |
ServiceNow | ✓ | | | ✓ |
GA4 | ✓ | | | ✓ |
Can I customize managed connectors?
You can choose the objects to ingest, the destination, the schedule, permissions, notifications, and more. You can't customize the ingestion process itself because these connectors are fully managed. If you need deeper customization, use DLT or Structured Streaming instead.
What's the difference between managed connectors, Lakehouse Federation, and Delta Sharing?
Lakehouse Federation allows you to query external data sources without moving your data. Delta Sharing allows you to securely share live data across platforms, clouds, and regions.
When you have a choice between managed connectors, Lakehouse Federation, and Delta Sharing, choose Delta Sharing for the following scenarios:
- Limiting data duplication.
- Querying the freshest possible data.
Choose Lakehouse Federation for the following scenarios:
- Ad hoc reporting or proof-of-concept work on your ETL pipelines.
What's the difference between managed connectors and Auto Loader?
Managed connectors allow you to incrementally ingest data from SaaS applications like Salesforce and databases like SQL Server. Auto Loader is a cloud object storage connector that allows you to incrementally ingest files as they arrive in S3, ADLS, and GCS. It's compatible with Structured Streaming and DLT but doesn't offer fully managed ingestion pipelines.
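For comparison, here's a minimal Auto Loader sketch. The bucket path, schema location, and table name are hypothetical, and `spark` is assumed to be the session provided in a Databricks notebook.

```python
# Incrementally ingest JSON files as they arrive in cloud storage.
stream = (
    spark.readStream.format("cloudFiles")    # Auto Loader source
    .option("cloudFiles.format", "json")     # format of arriving files
    .option("cloudFiles.schemaLocation",
            "/Volumes/main/default/checkpoints/schema")  # inferred-schema tracking
    .load("s3://my-bucket/landing/")         # hypothetical bucket path
)

(
    stream.writeStream
    .option("checkpointLocation", "/Volumes/main/default/checkpoints/ingest")
    .trigger(availableNow=True)              # process everything new, then stop
    .toTable("main.default.raw_events")
)
```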
Can managed connectors write back to the data source?
No. If you're interested in this functionality, reach out to your account team.
Can a pipeline write to multiple destination schemas?
This feature is supported in the Lakeflow Connect API for all managed SaaS connectors, such as Salesforce, Workday, and ServiceNow.
If you use this feature, your pipeline becomes API-only: you can't edit it in the UI.
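As a sketch, an ingestion definition that fans tables out to two destination schemas might look like the following; the connection, table, catalog, and schema names are hypothetical, and the payload shape is an assumption based on the `/api/2.0/pipelines` endpoint.

```python
# Each object carries its own destination, so one pipeline can write to
# multiple schemas. Built as a plain dict for a POST /api/2.0/pipelines call.
ingestion_definition = {
    "connection_name": "salesforce_connection",
    "objects": [
        {"table": {
            "source_table": "Account",
            "destination_catalog": "main",
            "destination_schema": "sales",      # first destination schema
        }},
        {"table": {
            "source_table": "Case",
            "destination_catalog": "main",
            "destination_schema": "support",    # second destination schema
        }},
    ],
}
```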
Can I change the name of a table that I ingest?
This feature is supported in the Lakeflow Connect API for all managed connectors.
If you use this feature, your pipeline becomes API-only: you can't edit API-only pipelines in the UI.
For each table that you'd like to rename, add the `destination_table` configuration with your desired table name.
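For example, a table object that renames `Account` on ingestion might look like this sketch; all names are placeholders.

```python
# Without `destination_table`, the source table name is reused.
object_spec = {
    "table": {
        "source_table": "Account",
        "destination_catalog": "main",
        "destination_schema": "sales",
        "destination_table": "sf_accounts",  # desired name in the destination
    }
}
```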
What happens if a pipeline is still running (update N) when the next update is scheduled to run (update N+1)?
Databricks skips update N+1 and resumes with update N+2, assuming that update N has completed by then.
What happens to the destination tables when an ingestion pipeline is deleted?
The destination tables are dropped when the ingestion pipeline is deleted.
How are managed connectors priced?
Managed connectors have a compute-based pricing model.
SaaS sources like Salesforce and Workday, which run exclusively on serverless infrastructure, incur serverless DLT DBU charges.
For database sources like SQL Server, ingestion gateways can run in classic mode or serverless mode depending on the source, and ingestion pipelines run on serverless. As a result, you can receive both classic and serverless DLT DBU charges.
For rate details, see the DLT pricing page.
Connector-specific FAQs
For connector-specific FAQs, see the documentation for your connector.