Managed connector FAQs

This article answers frequently asked questions about managed connectors in Databricks Lakeflow Connect.

What's the difference between managed connectors, Lakehouse Federation, and Delta Sharing?

Lakehouse Federation allows you to query external data sources without moving your data. Delta Sharing allows you to securely share live data across platforms, clouds, and regions. Databricks recommends ingestion with managed connectors because they scale to handle high data volumes and low-latency querying while respecting third-party API limits. However, you might want to query your data without moving it.

When you have a choice between managed connectors, Lakehouse Federation, and Delta Sharing, choose Delta Sharing for the following scenarios:

  • Limiting data duplication.
  • Querying the freshest possible data.

Choose Lakehouse Federation for the following scenarios:

  • Ad hoc reporting or proof-of-concept work on your ETL pipelines.

What's the difference between managed connectors and Auto Loader?

Managed connectors allow you to incrementally ingest data from SaaS applications like Salesforce and databases like SQL Server. Auto Loader is a cloud object storage connector that allows you to incrementally ingest files as they arrive in S3, ADLS, and GCS. It is compatible with Structured Streaming and DLT but does not offer fully managed ingestion pipelines.
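For reference, a minimal Auto Loader stream looks like the following sketch (PySpark on Databricks, where `spark` is the active SparkSession; the paths, bucket, and table name are placeholders):

```python
# Placeholder paths: replace with your own storage and checkpoint locations.
df = (
    spark.readStream
    .format("cloudFiles")                                # Auto Loader source
    .option("cloudFiles.format", "json")                 # format of incoming files
    .option("cloudFiles.schemaLocation", "/tmp/schema")  # where schema info is tracked
    .load("s3://my-bucket/landing/")                     # picks up new files incrementally
)

(
    df.writeStream
    .option("checkpointLocation", "/tmp/checkpoint")     # enables exactly-once progress
    .trigger(availableNow=True)                          # process available files, then stop
    .toTable("main.default.raw_events")                  # placeholder destination table
)
```

Because this is Structured Streaming, you manage the pipeline yourself (checkpoints, triggers, and destination tables), which is the operational work managed connectors take on for you.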

Can managed connectors write back to third-party apps and databases?

No. If you’re interested in this functionality, reach out to your account team.

What is SCD type 1 vs. type 2?

The Slowly Changing Dimensions (SCD) setting determines how to handle changes in your data over time. Enable SCD type 1 (history tracking off) to overwrite outdated records as they're updated and deleted in the source. Enable SCD type 2 (history tracking on) to maintain a history of those changes. Note that deleting a table or column does not delete that data from the destination, even when SCD type 1 is selected.

Not all connectors support history tracking (SCD type 2).
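The difference between the two settings can be illustrated with a small, hypothetical example in plain Python (this is not the managed connector implementation, just the bookkeeping each mode implies):

```python
from datetime import date

# Hypothetical destination tables keyed by customer id.
scd1_table = {}   # SCD type 1: one row per key, overwritten in place
scd2_table = []   # SCD type 2: one row per version, history preserved

def apply_update(key, value, effective_date):
    """Apply a source change under both SCD strategies."""
    # SCD type 1 (history tracking off): overwrite the outdated record.
    scd1_table[key] = value

    # SCD type 2 (history tracking on): close the current version and
    # append a new one, preserving the change history.
    for row in scd2_table:
        if row["key"] == key and row["end_date"] is None:
            row["end_date"] = effective_date
    scd2_table.append(
        {"key": key, "value": value,
         "start_date": effective_date, "end_date": None}
    )

apply_update(1, "Bronze tier", date(2024, 1, 1))
apply_update(1, "Gold tier", date(2024, 6, 1))

# SCD type 1 now holds only "Gold tier" for key 1;
# SCD type 2 holds both versions, with the first one end-dated.
```

With history tracking on, downstream queries can reconstruct what the record looked like at any point in time; with it off, only the latest state survives.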

What is the cost for managed connectors?

Managed connectors have a compute-based pricing model.

SaaS sources like Salesforce and Workday, which run exclusively on serverless infrastructure, incur serverless DLT DBU charges.

For database sources like SQL Server, ingestion gateways can run in classic mode or serverless mode depending on the source, and ingestion pipelines run on serverless. As a result, you can receive both classic and serverless DLT DBU charges.

For rate details, see the DLT pricing page.

Salesforce

Does the Salesforce ingestion connector support Salesforce Data Cloud?

The Salesforce ingestion connector supports Salesforce Sales Cloud. It doesn't support Salesforce Data Cloud, but Lakehouse Federation allows you to query data in Salesforce Data Cloud without moving it. See Run federated queries on Salesforce Data Cloud.

ServiceNow

How does the connector pull data from ServiceNow?

The ServiceNow connector uses the ServiceNow Table API v2.

Could using the Table API impact the ServiceNow instance?

Yes. However, the impact depends on the amount of data ingested. For example, it is typically more noticeable in the initial snapshot than during an incremental read.
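As an illustration, an incremental read against the Table API v2 amounts to a paginated GET with a query on the cursor column. The following sketch builds such a request URL; the instance name, table, and cursor value are placeholders, and this is not the connector's actual implementation:

```python
from urllib.parse import urlencode

def table_api_url(instance, table, cursor_value, offset=0, limit=1000):
    """Build a ServiceNow Table API v2 request URL that reads rows
    updated after the last-seen cursor value, in pages of `limit` rows."""
    params = {
        # Only rows changed since the last successful read, in cursor order.
        "sysparm_query": f"sys_updated_on>{cursor_value}^ORDERBYsys_updated_on",
        "sysparm_offset": offset,
        "sysparm_limit": limit,
    }
    return (f"https://{instance}.service-now.com"
            f"/api/now/v2/table/{table}?{urlencode(params)}")

url = table_api_url("example", "incident", "2024-01-01 00:00:00")
```

The initial snapshot is heavier because the query matches every row in the table, whereas an incremental read typically matches only a small slice.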

Why is my ServiceNow ingestion performance slow?

Databricks recommends working with your ServiceNow administrator to enable ServiceNow-side indexing on the cursor field. The cursor column is selected from the following list, in order of availability and preference:

  • sys_updated_on (first choice)
  • sys_created_on (second choice)
  • sys_archived (third choice)

This is a standard approach for improving performance when ingesting using the ServiceNow APIs. Setting the index allows Databricks to avoid fully scanning the entire sys_updated_on column, which can bottleneck large updates. For instructions, see Create a table index in the ServiceNow documentation. If the issue persists, create a support ticket.
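The cursor selection order described above can be sketched as a simple preference scan (the column names come from ServiceNow's standard audit fields; the function itself is illustrative, not the connector's code):

```python
# Cursor candidates in order of preference.
CURSOR_PREFERENCE = ["sys_updated_on", "sys_created_on", "sys_archived"]

def pick_cursor_column(available_columns):
    """Return the first preferred cursor column the table offers,
    or None if no candidate is available."""
    for column in CURSOR_PREFERENCE:
        if column in available_columns:
            return column
    return None

pick_cursor_column({"sys_id", "sys_created_on"})  # -> "sys_created_on"
```

Whichever column is chosen is the one that benefits from a ServiceNow-side index, since every incremental read filters and orders by it.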

Microsoft SQL Server

How does Databricks connect to SQL Server?

Databricks connects to SQL Server using transport layer security (TLS). Credentials are stored securely inside Unity Catalog and can only be retrieved if the user running the ingestion flow has appropriate permissions. You should create a separate user in SQL Server for ingesting data. If there are databases or tables you do not want to be available, use built-in SQL Server permissions to ensure that the ingestion user does not have access to those entities.

Is this a one-way connection?

Yes. Reverse ETL is not supported.

If the pipeline fails, does ingestion resume without data loss?

Yes. Databricks tracks what it has extracted from the source and what it has applied in the destination. If a failure occurs, Databricks can resume from that point.
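Conceptually, this is a checkpointing pattern: durably record the last applied offset after each batch, and restart from it after a failure. A toy sketch of the idea (not Databricks internals):

```python
def run_ingestion(source_rows, destination, checkpoint):
    """Ingest rows, persisting the last applied offset so a restart
    resumes exactly where the previous run stopped."""
    start = checkpoint.get("offset", 0)
    for offset in range(start, len(source_rows)):
        destination.append(source_rows[offset])   # apply to destination
        checkpoint["offset"] = offset + 1         # record progress durably

source = ["a", "b", "c", "d"]
dest, ckpt = [], {}

# First run "fails" partway through: simulate by seeing only two rows.
run_ingestion(source[:2], dest, ckpt)

# Restart: resumes at offset 2, with no duplicates and no data loss.
run_ingestion(source, dest, ckpt)
```

Because progress is recorded only after a row is applied, a crash between the two steps re-applies at most the in-flight batch rather than losing data.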