Salesforce ingestion connector limitations

This article lists limitations and considerations for ingesting data from Salesforce using Databricks Lakeflow Connect.

General SaaS connector limitations

The limitations in this section apply to all SaaS connectors in Lakeflow Connect.

  • When you run a scheduled pipeline, alerts don't trigger immediately. Instead, they trigger when the next update runs.
  • When a source table is deleted, the destination table is not automatically deleted. You must delete the destination table manually. This differs from DLT behavior.
  • During source maintenance periods, Databricks might not be able to access your data.
  • If a source table name conflicts with an existing destination table name, the pipeline update fails.
  • Multi-destination pipeline support is API-only.
  • You can optionally rename a table that you ingest. If you rename a table in your pipeline, it becomes an API-only pipeline, and you can no longer edit the pipeline in the UI.
  • Column-level selection and deselection are API-only.
  • If you select a column after a pipeline has already started, the connector does not automatically backfill data for the new column. To ingest historical data, manually run a full refresh on the table.
  • Managed ingestion pipelines aren't supported for the following:
    • Workspaces in AWS GovCloud regions
    • Workspaces in Azure GovCloud regions
    • FedRAMP-compliant workspaces

Connector-specific limitations

The limitations in this section are specific to the Salesforce ingestion connector.

Authentication

  • Salesforce allows you to rotate a refresh token, but the connector doesn't support refresh token rotation.

Data types

  • Salesforce data of type NUMBER and CURRENCY loses three digits of precision when ingested. These values can have up to 18 digits before the decimal point in Salesforce, but only 15 digits before the decimal point in Databricks.
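
To find values that are affected by this limit before you ingest them, you can count the digits before the decimal point. The following is a minimal sketch, not part of the connector: the function names are hypothetical, and it assumes that you already have the Salesforce values available as Python `Decimal` objects.

```python
from decimal import Decimal

# Digit limits taken from the limitation described above.
SALESFORCE_INTEGER_DIGITS = 18
DATABRICKS_INTEGER_DIGITS = 15


def integer_digits(value: Decimal) -> int:
    """Count the digits before the decimal point, ignoring the sign."""
    whole = int(abs(value))  # int() truncates toward zero
    return len(str(whole)) if whole else 0


def exceeds_destination_precision(value: Decimal) -> bool:
    """Return True if the value has more integer digits than Databricks keeps."""
    return integer_digits(value) > DATABRICKS_INTEGER_DIGITS


# Hypothetical sample values for illustration.
samples = [Decimal("123456789012345678.12"), Decimal("123456789012345.12")]
at_risk = [v for v in samples if exceeds_destination_precision(v)]
```

In this sketch, `at_risk` contains only the first sample, because it has 18 digits before the decimal point.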

Pipelines

  • Each pipeline supports a maximum of 250 objects. However, there is no limit on the number of rows or columns within these objects.
  • The base64, address, location, and complexValue types are not supported. Columns of these types are automatically dropped during ingestion.
  • Databricks can ingest formula fields. However, Databricks requires a full snapshot of these fields. This means that pipeline latency depends on whether your Salesforce data includes formula fields and on the volume of updates in your Salesforce data.
  • Databricks updates formula fields at the same cadence as the rest of the pipeline. However, within the cadence of your pipeline updates, non-formula fields might be updated earlier than formula fields.
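
If you plan to ingest more than 250 objects, one possible approach is to split them across several pipelines. The following sketch shows that planning step, along with a check for which columns are dropped because of their type. It is an illustration under stated assumptions: the helper names are hypothetical, the field types are assumed to be supplied as a plain name-to-type mapping, and nothing here calls a Databricks or Salesforce API.

```python
from typing import Dict, Iterator, List, Sequence

# Limits taken from the list above.
MAX_OBJECTS_PER_PIPELINE = 250
UNSUPPORTED_FIELD_TYPES = {"base64", "address", "location", "complexvalue"}


def batch_objects(
    names: Sequence[str], size: int = MAX_OBJECTS_PER_PIPELINE
) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` object names."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(names), size):
        yield list(names[start:start + size])


def ingestible_fields(field_types: Dict[str, str]) -> List[str]:
    """Return the field names whose type is not dropped during ingestion."""
    return [
        name
        for name, field_type in field_types.items()
        if field_type.lower() not in UNSUPPORTED_FIELD_TYPES
    ]


# Hypothetical inputs for illustration.
object_names = [f"Object{i}" for i in range(600)]
pipeline_batches = list(batch_objects(object_names))

account_fields = {
    "Id": "id",
    "Name": "string",
    "BillingAddress": "address",
    "Photo": "base64",
}
kept = ingestible_fields(account_fields)
```

With 600 hypothetical objects, `pipeline_batches` holds three groups, none larger than 250, and `kept` excludes the address and base64 columns.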

Schema evolution

  • Databricks treats soft deletions the same as inserts and updates. When a row is deleted from Salesforce, it is deleted from the bronze table at the next sync of the data. For example, suppose you have a pipeline that runs hourly. If a sync runs at 12:00 PM and a record is deleted at 12:30 PM, the deletion isn't reflected until the 1:00 PM sync.

    There is one edge case: If the pipeline doesn't run between the time that records are deleted and the time that they are purged from the Salesforce Recycle Bin, Databricks misses those deletions. The only way to recover is to run a full refresh.

  • Databricks doesn't automatically reflect hard deletions. To reflect them, you must fully refresh the destination table.

  • SCD type 2 is not supported.
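
The timing of soft deletions described above can be reasoned about with a small calculation. The following sketch assumes a fixed-interval schedule and that you know when a record was deleted and when it was purged from the Recycle Bin. The function names and parameters are hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta


def next_sync(after: datetime, first_sync: datetime, interval: timedelta) -> datetime:
    """Return the first scheduled sync at or after `after`."""
    if after <= first_sync:
        return first_sync
    elapsed = after - first_sync
    periods = -(-elapsed // interval)  # ceiling division on timedeltas
    return first_sync + periods * interval


def deletion_is_missed(
    deleted_at: datetime,
    purged_at: datetime,
    first_sync: datetime,
    interval: timedelta,
) -> bool:
    """True if the record is purged before any sync runs after the deletion."""
    return purged_at < next_sync(deleted_at, first_sync, interval)


# The hourly example from the text: sync at 12:00 PM, deletion at 12:30 PM.
noon = datetime(2024, 1, 1, 12, 0)
hourly = timedelta(hours=1)
deleted = datetime(2024, 1, 1, 12, 30)

reflected_at = next_sync(deleted, noon, hourly)

# Purged at 12:45 PM, before the 1:00 PM sync: the deletion is missed.
missed = deletion_is_missed(deleted, datetime(2024, 1, 1, 12, 45), noon, hourly)
```

Here `reflected_at` is the 1:00 PM sync, and `missed` is true because the hypothetical purge happens before that sync, which is the case that requires a full refresh.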

Tables

The following is a non-exhaustive list of unsupported Salesforce objects:

  • Objects with WHERE clauses or LIMIT restrictions:

    • Announcement
    • AppTabMember
    • CollaborationGroupRecord
    • ColorDefinition
    • ContentDocumentLink
    • ContentFolderItem
    • ContentFolderMember
    • DataStatistics
    • DatacloudDandBCompany
    • EntityParticle
    • FieldDefinition
    • FieldHistoryArchive
    • FlexQueueItem
    • FlowVariableView
    • FlowVersionView
    • IconDefinition
    • IdeaComment
    • NetworkUserHistoryRecent
    • OwnerChangeOptionInfo
    • PicklistValueInfo
    • PlatformAction
    • RelationshipDomain
    • RelationshipInfo
    • SearchLayout
    • SiteDetail
    • TaskWhoRelation
    • UserEntityAccess
    • UserFieldAccess
    • Vote
  • Objects for real-time event monitoring:

    • ApiEvent
    • BulkApiResultEventStore
    • EmbeddedServiceDetail
    • EmbeddedServiceLabel
    • FormulaFunction
    • FormulaFunctionAllowedType
    • FormulaFunctionCategory
    • IdentityProviderEventStore
    • IdentityVerificationEvent
    • LightningUriEvent
    • ListViewEvent
    • LoginAsEvent
    • LoginEvent
    • LogoutEvent
    • Publisher
    • RecordActionHistory
    • ReportEvent
    • TabDefinition
    • UriEvent
  • Objects ending with __b, __x, or __hd:

    • ActivityMetric
    • ActivityMetricRollup
    • Site
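
Because this list is non-exhaustive, a pre-check can only catch the cases that follow a stated rule, such as the name suffixes above. The following sketch filters candidate object names by suffix. The function names are hypothetical, and the case-insensitive comparison is an assumption, not documented connector behavior.

```python
from typing import Iterable, List, Tuple

# Suffixes taken from the list above.
UNSUPPORTED_SUFFIXES: Tuple[str, ...] = ("__b", "__x", "__hd")


def has_unsupported_suffix(object_name: str) -> bool:
    """Return True if the object name ends with an unsupported suffix."""
    # Assumption: compare case-insensitively.
    return object_name.lower().endswith(UNSUPPORTED_SUFFIXES)


def split_by_suffix(names: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split names into (candidates, rejected_by_suffix)."""
    candidates: List[str] = []
    rejected: List[str] = []
    for name in names:
        (rejected if has_unsupported_suffix(name) else candidates).append(name)
    return candidates, rejected


# Hypothetical object names for illustration.
candidates, rejected = split_by_suffix(
    ["Account", "Invoice__c", "Order_Archive__b", "External_Order__x", "Account__hd"]
)
```

A name that passes this check can still be unsupported for another reason, so treat the result as a first filter only.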