Troubleshoot Microsoft SharePoint ingestion
This page describes common issues with the Microsoft SharePoint connector in Databricks Lakeflow Connect and how to resolve them.
General pipeline troubleshooting
If a pipeline fails during execution, click the failed step and check whether the error message gives enough information to identify the cause.

To view and download the cluster logs, open the pipeline details page, click Update details in the right-hand pane, and then click Logs. Scan the logs for errors or exceptions.
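
If the downloaded log is large, a short script can surface the lines worth reading first. The following is a minimal sketch, not part of the connector: the function name `find_error_lines`, the file name `driver_log.txt`, and the marker strings in `ERROR_PATTERN` are assumptions, so adjust them to match what your logs actually contain.

```python
import re
from typing import Iterable, List, Tuple

# Assumed markers for error lines; the real log format may differ.
ERROR_PATTERN = re.compile(r"ERROR|FATAL|Exception|Traceback")


def find_error_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) pairs for lines that look like errors."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if ERROR_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


# Example usage with a hypothetical downloaded log file:
# with open("driver_log.txt", encoding="utf-8") as f:
#     for number, text in find_error_lines(f):
#         print(f"{number}: {text}")
```

The match is case-sensitive and deliberately broad, so expect some false positives. Use the line numbers to jump to the surrounding context in the full log.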

Restrict access to SharePoint files
To restrict the SharePoint files that the connector can access, create a dedicated Microsoft Entra ID user with restricted SharePoint permissions and authenticate to SharePoint with that account. Because the connector uses delegated access (U2M OAuth), it acts on behalf of a Microsoft Entra ID user and can only access files that the user has permission to view.
Authentication errors
If you encounter OAuth errors, run the following code to confirm that your refresh token is working as expected:
# Fill in these values
refresh_token = ""
tenant_id = ""
client_id = ""
client_secret = ""
site_id = ""
# Get an access token
import requests
# Token endpoint
token_url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
scopes = ["Sites.Read.All"]
scope = " ".join(f"https://graph.microsoft.com/{s}" for s in scopes)
scope += " offline_access"
# Parameters for the request
token_params = {
    "client_id": client_id,
    "client_secret": client_secret,
    "grant_type": "refresh_token",
    "refresh_token": refresh_token,
    "scope": scope,
}
# Send a POST request to the token endpoint
response = requests.post(token_url, data=token_params)
token_response = response.json()
access_token = token_response.get("access_token")
# If the refresh token works, the response contains an access token.
# If it does not, print the response to see the error details.
if access_token is None:
    print(token_response)
# Next, check whether the access token can list the drives in your SharePoint site.
# List all drives
url = f"https://graph.microsoft.com/v1.0/sites/{site_id}/drives"
# Authorization header with access token
headers = {
    "Authorization": f"Bearer {access_token}",
    "Accept": "application/json",
}
# Send a GET request to list the drives in the site
drives = requests.get(url, headers=headers).json()
print(drives)
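
To make the two responses easier to read, you can summarize them instead of inspecting the raw JSON. The following is a minimal sketch under stated assumptions: the helper names `describe_token_response` and `describe_graph_response` are hypothetical, and the code assumes the standard OAuth 2.0 error fields (`error`, `error_description`) in the token response and the usual Microsoft Graph shapes (`error.code` and `error.message` on failure, a `value` list of drives on success).

```python
from typing import Any, Dict


def describe_token_response(payload: Dict[str, Any]) -> str:
    """Summarize a token endpoint JSON response without printing secrets."""
    if "access_token" in payload:
        return "ok: access token returned"
    # Assumed OAuth 2.0 error fields; fall back to placeholders if absent.
    error = payload.get("error", "unknown_error")
    description = payload.get("error_description", "no description provided")
    return f"failed: {error}: {description}"


def describe_graph_response(payload: Dict[str, Any]) -> str:
    """Summarize a Microsoft Graph JSON response for the list-drives call."""
    if "error" in payload:
        err = payload["error"]
        return f"failed: {err.get('code')}: {err.get('message')}"
    # On success, the drives are assumed to be listed under "value".
    names = [drive.get("name") for drive in payload.get("value", [])]
    return f"ok: {len(names)} drive(s): {names}"
```

Pass the parsed JSON from the token request to the first helper and the parsed JSON from the drives request to the second. An `invalid_grant` error from the token endpoint typically means that the refresh token has expired or been revoked, in which case you need to authenticate again.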