Salesforce ingestion connector FAQs

This page answers frequently asked questions about the Salesforce ingestion connector in Databricks Lakeflow Connect.

General managed connector FAQs

The answers in Managed connector FAQs apply to all managed connectors in Lakeflow Connect. Keep reading for Salesforce-specific FAQs.

Which Salesforce products does the Salesforce ingestion connector support?

Lakeflow Connect supports ingesting data from the Salesforce products in the following table. Databricks also offers a zero-copy connector in Lakehouse Federation to run federated queries on Salesforce Data Cloud.

| Salesforce product | Lakeflow Connect support | Alternative options |
| --- | --- | --- |
| Automotive Cloud | Yes | |
| B2B Commerce | Yes | |
| B2C Commerce Cloud | No | Data Cloud |
| Data Cloud | Yes | |
| Digital Engagement | Yes | |
| Education Cloud | Yes | |
| Energy and Utilities Cloud | Yes | |
| Experience Cloud | Yes | |
| Feedback Management | Yes | |
| Field Service | Yes | |
| Health Cloud | Yes | |
| Life Sciences Cloud | Yes | |
| Lightning Platform | Yes | |
| Loyalty Cloud | Yes | |
| Media Cloud | Yes | |
| Manufacturing Cloud | Yes | |
| Marketing Cloud | No | Data Cloud |
| Net Zero Cloud | Yes | |
| Non-Profit Cloud | Yes | |
| Order Management | Yes | |
| Platform (standard and custom objects) | Yes | |
| Public Sector Solutions | Yes | |
| Rebate Management | Yes | |
| Retail & Consumer Goods Cloud | Yes | |
| Revenue Cloud | Yes | |
| Sales Cloud | Yes | |
| Salesforce Maps | Yes | |
| Salesforce Scheduler | Yes | |
| Service Cloud | Yes | |

Which Salesforce connector should I use?

Databricks offers multiple connectors for Salesforce. There are two zero-copy connectors: the Salesforce Data Cloud file sharing connector and the Salesforce Data Cloud query federation connector. These allow you to query data in Salesforce Data Cloud without moving it. There is also a Salesforce ingestion connector that copies data from various Salesforce products, including Salesforce Data Cloud and Salesforce Sales Cloud.

The following table summarizes the differences between the Salesforce connectors in Databricks:

| Connector | Use case | Supported Salesforce products |
| --- | --- | --- |
| Salesforce Data Cloud file sharing | When you use the Salesforce Data Cloud file sharing connector in Lakehouse Federation, Databricks calls Salesforce Delivery-as-a-Service (DaaS) APIs to read data in the underlying cloud object storage location directly. Queries are run on Databricks compute without using the JDBC protocol. Compared to query federation, file sharing is ideal for federating a large amount of data. It offers improved performance for reading files from multiple data sources and better pushdown capabilities. See Lakehouse Federation for Salesforce Data Cloud File Sharing. | Salesforce Data Cloud |
| Salesforce Data Cloud query federation | When you use the Salesforce Data Cloud query federation connector in Lakehouse Federation, Databricks uses JDBC to connect to source data and pushes queries down into Salesforce. See Run federated queries on Salesforce Data Cloud. | Salesforce Data Cloud |
| Salesforce ingestion | The Salesforce ingestion connector in Lakeflow Connect allows you to create fully managed ingestion pipelines from Salesforce Platform data, including data in Salesforce Data Cloud and Salesforce Sales Cloud. This connector maximizes value by leveraging not only CDP data but also CRM data in the Data Intelligence Platform. See Ingest data from Salesforce. | Salesforce Data Cloud, Salesforce Sales Cloud, and more. For a comprehensive list of supported Salesforce products, see Which Salesforce products does the Salesforce ingestion connector support? on this page. |

Which Salesforce APIs does the ingestion connector use?

The connector uses both Salesforce Bulk API 2.0 and Salesforce REST API v63. For each pipeline update, the connector chooses the API based on how much data it must ingest. The goal is to limit load on the Salesforce APIs. For a larger amount of data (for example, the initial load of a typical object or the incremental load of a very active object), the connector typically uses Bulk API. For a smaller amount of data (for example, the incremental load of a typical object or the initial load of a very small object), the connector typically uses REST API.
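
The exact thresholds aren't published; the sketch below only illustrates the kind of volume-based API selection described above. The `ROW_THRESHOLD` value and the function name are hypothetical, not the connector's actual logic or limits.

```python
# Conceptual sketch only: threshold-based choice between Bulk API 2.0 and
# REST API. ROW_THRESHOLD and the function name are assumptions for
# illustration, not the connector's real implementation.

ROW_THRESHOLD = 100_000  # assumed cutoff, for illustration only


def choose_salesforce_api(estimated_rows: int) -> str:
    """Pick the API that keeps load on Salesforce low for a given volume."""
    if estimated_rows >= ROW_THRESHOLD:
        return "Bulk API 2.0"  # large loads: initial sync, very active objects
    return "REST API"          # small loads: typical incremental sync


print(choose_salesforce_api(5_000_000))  # Bulk API 2.0
print(choose_salesforce_api(1_200))      # REST API
```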

How does Databricks connect to Salesforce?

Databricks connects to the Salesforce APIs over HTTPS. Credentials are stored securely in Unity Catalog and can only be retrieved if the user running the ingestion flow has the appropriate permissions. You can optionally create a separate user in Salesforce for ingesting data. If there are particular objects or columns that you want to restrict access to, use the built-in Salesforce permissions to ensure that the ingestion user can't access those entities.
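
For example, if you create a dedicated ingestion user, you can verify which objects that user can query before you configure the pipeline. The following sketch uses the third-party simple-salesforce package and placeholder credentials; it is an optional pre-check you run yourself, not part of Lakeflow Connect.

```python
# Optional pre-check (not part of Lakeflow Connect): log in as the dedicated
# ingestion user and list the objects it can query, to confirm that Salesforce
# permission sets restrict access as intended. Uses the third-party
# simple-salesforce package; the credentials below are placeholders.
from simple_salesforce import Salesforce

sf = Salesforce(
    username="ingestion-user@example.com",
    password="<password>",
    security_token="<security-token>",
)

visible = [o["name"] for o in sf.describe()["sobjects"] if o["queryable"]]
print(f"{len(visible)} queryable objects, for example: {visible[:5]}")
```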

How many Salesforce objects can be ingested in one pipeline?

Databricks recommends ingesting at most 250 tables per Salesforce pipeline. If you need to ingest more objects, create multiple pipelines.
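
For example, if you have several hundred objects to ingest, you could split the list into groups of at most 250 and create one pipeline per group. The helper below is a hypothetical illustration, not a Databricks API.

```python
# Hypothetical helper: split a long object list into groups of at most 250 so
# that each group can be ingested by its own pipeline. Not a Databricks API.
def split_into_pipelines(objects: list[str], max_per_pipeline: int = 250) -> list[list[str]]:
    return [
        objects[i : i + max_per_pipeline]
        for i in range(0, len(objects), max_per_pipeline)
    ]


all_objects = [f"Custom_Object_{i}__c" for i in range(600)]  # placeholder names
groups = split_into_pipelines(all_objects)
print([len(g) for g in groups])  # [250, 250, 100]
```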

Is there a limit on the number of attributes per object?

No.

How does the connector incrementally pull updates?

The connector selects the cursor column from the following list, in order of preference: SystemModstamp, LastModifiedDate, CreatedDate, and LoginTime. For example, if SystemModstamp is unavailable, then it looks for LastModifiedDate. Objects that don't have any of these columns can't be ingested incrementally. Formula fields can't be ingested incrementally.
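
Conceptually, the selection works like the following sketch; the helper and its inputs are illustrative, not the connector's code.

```python
# Conceptual sketch of the cursor-column preference order described above.
# The helper and the example schemas are illustrative only.
CURSOR_PREFERENCE = ["SystemModstamp", "LastModifiedDate", "CreatedDate", "LoginTime"]


def pick_cursor_column(object_columns: set[str]) -> str | None:
    """Return the first preferred cursor column present on the object,
    or None if the object can't be ingested incrementally."""
    for candidate in CURSOR_PREFERENCE:
        if candidate in object_columns:
            return candidate
    return None


print(pick_cursor_column({"Id", "LastModifiedDate", "Name"}))  # LastModifiedDate
print(pick_cursor_column({"Id", "Name"}))                      # None: no incremental ingestion
```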

Why does the number of updated rows match the total number of rows, even on incremental pipeline runs?

The connector fully downloads formula fields during each pipeline update. In parallel, it incrementally reads non-formula fields. Finally, it combines them into one table. Because formula fields are re-read for every row, each update touches every row, even when only a few non-formula values have changed.
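
The connector's internals aren't exposed, but conceptually the result resembles the following join of an incremental non-formula snapshot with a freshly downloaded formula-field snapshot; the DataFrames and column names are placeholders.

```python
# Conceptual illustration only: combine incrementally read non-formula fields
# with formula fields that are re-downloaded in full on every update.
# DataFrame contents and column names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

non_formula = spark.createDataFrame(
    [("001A", "Acme", "2024-01-01")], ["Id", "Name", "SystemModstamp"]
)
formula = spark.createDataFrame(
    [("001A", 1200.0)], ["Id", "Annual_Run_Rate__c"]  # formula field, re-read in full
)

combined = non_formula.join(formula, on="Id", how="left")
combined.show()
```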

How does the connector handle retries?

The connector automatically retries on failure, with exponential backoff. It waits 1 second before the first retry, then 2 seconds, then 4 seconds, and so on. If the failure persists, it stops retrying until the next run of the pipeline. You can monitor this activity in the pipeline usage logs, and you can set up notifications for fatal failures.
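
A minimal sketch of this retry pattern, assuming illustrative delays and a hypothetical retry cap (the connector's actual settings aren't published):

```python
# Illustrative retry-with-exponential-backoff pattern; the delays and retry
# cap are assumptions, not the connector's actual configuration.
import time


def call_with_backoff(fetch, max_attempts: int = 5):
    """Call fetch(), retrying with exponentially increasing waits on failure."""
    delay = 1  # seconds: 1, 2, 4, ...
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts:
                raise  # give up; the next pipeline run tries again
            time.sleep(delay)
            delay *= 2
```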

How does the connector handle Delta-incompatible data types?

Lakeflow Connect automatically transforms Salesforce data types to Delta-compatible data types. See Salesforce ingestion connector reference.

Does the connector support real-time ingestion?

No. If you're interested in this functionality, reach out to your account team.

How does the connector handle soft deletes?

Soft deletes are handled the same way as inserts and updates.

If your table has history tracking turned off: When a row is soft-deleted in Salesforce, it is deleted from the bronze table at the next sync. For example, suppose you have a pipeline running hourly. If a sync runs at 12:00 PM and a record is deleted at 12:30 PM, the deletion isn't reflected until the 1:00 PM sync.

If your table has history tracking turned on: The connector marks the original row as inactive by populating the __END_AT column.

There is one edge case: records that are deleted and then purged from the Salesforce Recycle Bin before the pipeline's next update. In this case, Databricks misses the deletes, and you must run a full refresh of the destination table to reflect them.

Note that some Salesforce objects, like the history object, do not support soft deletes.
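
If history tracking is turned on, you can filter soft-deleted (and otherwise superseded) rows out of downstream queries by keeping only rows where __END_AT is null, following the usual SCD type 2 convention. The table name below is a placeholder.

```python
# Example of reading only active rows from a destination table that has
# history tracking turned on: active rows have a NULL __END_AT.
# The three-level table name is a placeholder.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

active_accounts = spark.sql(
    """
    SELECT *
    FROM main.salesforce_bronze.account
    WHERE __END_AT IS NULL
    """
)
active_accounts.show()
```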

How does the connector handle hard deletes?

The connector doesn't capture hard deletes automatically; you must run a full refresh of the destination table to reflect them.