Microsoft SharePoint connector limitations
The Microsoft SharePoint connector is in Beta.
This page lists limitations and considerations for ingestion from Microsoft SharePoint using Databricks Lakeflow Connect.
General SaaS connector limitations
The limitations in this section apply to all SaaS connectors in Lakeflow Connect.
- When you run a scheduled pipeline, alerts don't trigger immediately. Instead, they trigger when the next update runs.
- When a source table is deleted, the destination table is not automatically deleted. You must delete the destination table manually. This differs from the behavior of Lakeflow Declarative Pipelines.
- During source maintenance periods, Databricks might not be able to access your data.
- If a source table name conflicts with an existing destination table name, the pipeline update fails.
- Multi-destination pipeline support is API-only.
- You can optionally rename a table that you ingest. If you rename a table in your pipeline, it becomes an API-only pipeline, and you can no longer edit the pipeline in the UI.
- Column-level selection and deselection are API-only.
- If you select a column after a pipeline has already started, the connector does not automatically backfill data for the new column. To ingest historical data, manually run a full refresh on the table (see the sketch after this list).
- Databricks can't ingest two or more tables with the same name in the same pipeline, even if they come from different source schemas.
- Managed ingestion pipelines aren't supported for workspaces in AWS GovCloud regions (FedRAMP High).
- Managed ingestion pipelines aren't supported for FedRAMP Moderate workspaces in the `us-east-2` or `us-west-1` regions.
- The connector assumes that the cursor columns in the source system are monotonically increasing.
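
To backfill historical data for a newly selected column, start a pipeline update with a full refresh of the affected table. The following is a minimal sketch using the Databricks SDK for Python; the pipeline ID and table name are placeholders, and the exact name format expected in `full_refresh_selection` depends on your pipeline configuration.

```python
# Minimal sketch: trigger a full refresh for a single table in an ingestion
# pipeline so that newly selected columns are backfilled with historical data.
# The pipeline ID and table name are placeholders.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # reads credentials from the environment or a config profile

w.pipelines.start_update(
    pipeline_id="<pipeline-id>",                    # placeholder pipeline ID
    full_refresh_selection=["my_schema.my_table"],  # placeholder table to fully refresh
)
```

A full refresh re-ingests the entire table, so consider running it during a low-traffic window if the table is large.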
Connector-specific limitations
- The SharePoint connector only supports files that are 100 MB or smaller. The metadata for files larger than 100 MB is ingested, but the file content is not downloaded.
- Ingesting file-level access control lists (ACLs) and other custom metadata from SharePoint is not supported.
- Ingesting files that are linked to a different SharePoint document library is not supported.
- Individual file selection and deselection within a drive are not supported. The connector ingests all of the files in a drive.
- The utils provided for downstream usage are limited to single-user clusters. However, single-user clusters can't access streaming tables created by other users. Therefore, each downstream user must create their own ingestion pipeline. You can modify the utils to make them work on serverless and shared clusters, but this can impact performance. See Examples.
- Some fields (for example, `quickXorHash` and `mimeType`) are not supported for all file formats. Even in these cases, file download and other metadata ingestion should work.
- Databricks recommends running ingestion at most once per hour.
- The connector is API-only. The Databricks UI isn't supported. For an example of creating a pipeline through the REST API, see the sketch below.
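
Because the connector is API-only, you create and manage the ingestion pipeline by calling the pipelines REST API (or an equivalent SDK). The sketch below posts a pipeline spec to the standard `/api/2.0/pipelines` endpoint; the connection name, object specification, and destination names inside `ingestion_definition` are illustrative assumptions rather than the exact SharePoint schema, so check the connector reference for the actual fields.

```python
# Minimal sketch: create a SharePoint ingestion pipeline through the REST API.
# The endpoint and auth pattern are the standard Databricks pipelines API; the
# ingestion_definition fields below are illustrative assumptions, not the exact
# SharePoint object spec.
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # for example, https://<workspace>.cloud.databricks.com
token = os.environ["DATABRICKS_TOKEN"]  # personal access token (placeholder auth method)

payload = {
    "name": "sharepoint-ingestion-pipeline",
    "ingestion_definition": {
        "connection_name": "my_sharepoint_connection",  # assumed Unity Catalog connection name
        "objects": [
            {
                # Illustrative object spec; SharePoint drives may use different field names.
                "table": {
                    "source_schema": "my_site",
                    "source_table": "my_drive",
                    "destination_catalog": "main",
                    "destination_schema": "ingested_sharepoint",
                }
            }
        ],
    },
}

resp = requests.post(
    f"{host}/api/2.0/pipelines",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()
print(resp.json())  # the response includes the new pipeline's ID
```

After the pipeline exists, subsequent operations such as starting updates or editing the table list also go through the API, because the UI isn't supported for this connector.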