Configure OAuth M2M for SharePoint ingestion

Beta

The SharePoint connector is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Databricks previews.

Preview

M2M OAuth for SharePoint is in Public Preview.

Learn how to configure OAuth machine-to-machine (M2M) authentication for SharePoint ingestion into Databricks.

Which permission model should I use?

M2M authentication supports the following permission models in Microsoft Azure:

  • Sites.Read.All: Grants access to all SharePoint sites in your organization. This option has fewer setup steps but provides broader access.
  • Sites.Selected: Grants access only to specific SharePoint sites. This requires additional configuration but follows the principle of least privilege.

Databricks recommends using Sites.Selected when possible to limit the service principal's access to only the sites you need to ingest.

Prerequisites

  • Admin privileges in your Microsoft Entra ID tenant.

Sites.Read.All permissions

This option grants the service principal access to all SharePoint sites in your organization.

Step 1: Get the SharePoint site ID

  1. Visit the desired SharePoint site in your browser.
  2. Append /_api/site/id to the URL.
  3. Press Enter. The browser displays the site ID. Make a note of it.
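The URL from the steps above can also be built in code. The following is a minimal sketch, assuming you start from the root URL of the site; the helper name `site_id_url` and the example site URL are hypothetical.

```python
def site_id_url(site_url: str) -> str:
    """Return the URL that displays the site ID for a SharePoint site."""
    # Drop any trailing slash so the path separator is not doubled.
    return site_url.rstrip("/") + "/_api/site/id"


# Hypothetical site URL, used only for illustration.
print(site_id_url("https://YOURINSTANCE.sharepoint.com/sites/my-site/"))
```

Open the printed URL in a browser session that is signed in to SharePoint to view the site ID.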

Step 2: Get SharePoint drive names (optional)

If you want to ingest all drives and documents in your SharePoint site, skip this step. If you only want to ingest a subset of the drives, you must collect their names.

The drive names are listed in the left-hand menu. There's a default drive called Documents in each site. However, your organization might have additional drives. For example, the drives in the following screenshot include doclib1, subsite1doclib1, and more.

View SharePoint drives

Some drives might be hidden from the list. The drive creator can configure this in the drive settings. In this case, hidden drives might be visible in the Site contents section.

View hidden SharePoint drives
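As an alternative to reading drive names from the SharePoint UI, you might list them through the Microsoft Graph "list drives" endpoint. The following is a minimal sketch, assuming you already have a Microsoft Graph access token that is allowed to read the site. The function names are hypothetical, and the sketch ignores paginated responses.

```python
import json
import urllib.request


def drives_url(site_id: str) -> str:
    """Build the Microsoft Graph URL that lists the drives of a site."""
    return f"https://graph.microsoft.com/v1.0/sites/{site_id}/drives"


def drive_names(response_body: dict) -> list[str]:
    """Extract the drive names from a Graph 'list drives' response body."""
    return [drive["name"] for drive in response_body.get("value", [])]


def fetch_drive_names(site_id: str, access_token: str) -> list[str]:
    """Call Microsoft Graph and return drive names (requires network access)."""
    request = urllib.request.Request(
        drives_url(site_id),
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return drive_names(json.load(response))
```

The result might not match the UI exactly, so compare it with the Site contents section before you rely on it.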

Step 3: Create a Microsoft Entra ID application

This step creates an application registration that can access the SharePoint files using a service principal.

  1. In the Microsoft Azure portal (https://portal.azure.com), click Microsoft Entra ID. You might have to search for "Microsoft Entra ID".

  2. In the left sidebar, under the Manage section, click App Registrations.

  3. Click New registration.

  4. In the Register an application form, specify the following:

    • A name for your application (for example, "Databricks SharePoint Ingestion").
    • Whether you want other tenants to access this application.

    You don't need to specify a redirect URL for M2M authentication.

  5. Click Register. You're redirected to the app details page.

  6. Make a note of the following values:

    • Application (client) ID
    • Directory (tenant) ID
  7. Next to Client credentials, click Add a certificate or secret.

  8. Click + New client secret.

  9. Add a description.

  10. Click Add. The updated list of client secrets displays.

  11. Copy the client secret value and store it securely. After you leave the page, you can't view the secret value again.
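The three values you collected in this step (tenant ID, client ID, and client secret) are the inputs to the OAuth 2.0 client credentials flow. The following optional sketch shows how they map onto a token request against the Microsoft identity platform, which can help you check the values outside of Databricks. The function names are hypothetical, and `fetch_access_token` needs network access and valid credentials.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return the token endpoint URL and form fields for the client credentials flow."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    fields = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # .default requests the application permissions consented for the app.
        "scope": "https://graph.microsoft.com/.default",
    }
    return url, fields


def fetch_access_token(tenant_id: str, client_id: str, client_secret: str) -> str:
    """Request an app-only access token (requires network access)."""
    url, fields = build_token_request(tenant_id, client_id, client_secret)
    data = urllib.parse.urlencode(fields).encode("utf-8")
    # Passing data makes urllib send a form-encoded POST request.
    with urllib.request.urlopen(urllib.request.Request(url, data=data)) as response:
        return json.load(response)["access_token"]
```

Avoid hardcoding the client secret in a notebook. Read it from a secret store instead.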

Step 4: Configure API permissions

Grant the application the necessary permissions to read SharePoint files.

  1. In the app registration page, click API permissions in the left-hand menu.

  2. Click + Add a permission.

  3. In the Request API permissions panel, click Microsoft Graph.

  4. Click Application permissions.

  5. Search for and select the following permissions:

    • Sites.Read.All
    • Files.Read.All
  6. Click Add permissions.

  7. Click Grant admin consent for [your organization].

  8. Click Yes to confirm.

    The permissions list shows a green checkmark in the Status column indicating admin consent has been granted.
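For troubleshooting, you might also confirm that the granted permissions appear in an app-only access token. The following is a minimal sketch, assuming the token is a JWT whose `roles` claim lists the granted application permissions. It decodes the payload without verifying the signature, so use it for inspection only. The function names are hypothetical.

```python
import base64
import json


def token_roles(access_token: str) -> list[str]:
    """Return the 'roles' claim from a JWT without verifying its signature."""
    payload_segment = access_token.split(".")[1]
    # Restore the base64url padding that JWT encoding strips.
    padded = payload_segment + "=" * (-len(payload_segment) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return claims.get("roles", [])


def missing_permissions(
    access_token: str,
    required=("Sites.Read.All", "Files.Read.All"),
) -> list[str]:
    """Return the required permissions that are absent from the token."""
    granted = set(token_roles(access_token))
    return [name for name in required if name not in granted]
```

An empty list from `missing_permissions` suggests that both permissions from this step are present in the token.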

Step 5: Create a connection in Databricks

  1. In Catalog Explorer, click External data in the left-hand menu.

  2. Click Create connection.

  3. In the Create connection dialog, specify the following:

    • Connection name: A unique name for your connection
    • Connection type: Microsoft SharePoint
    • Authentication type: OAuth Machine to Machine
    • Client ID: The Application (client) ID from Step 3
    • Client secret: The client secret value from Step 3
    • Domain: Your SharePoint domain in the format https://YOURINSTANCE.sharepoint.com
    • Tenant ID: The Directory (tenant) ID from Step 3
  4. Click Create.

Sites.Selected permissions

This option restricts the service principal's access to specific SharePoint sites only.

Steps 1-3: Complete the basic setup

Follow Steps 1-3 in the Sites.Read.All section. These steps are the same for both permission models.

  1. Get the SharePoint site ID.
  2. Get SharePoint drive names (optional).
  3. Create a Microsoft Entra ID application.

Step 4: Configure API permissions for Sites.Selected

Grant the application restricted permissions that require additional site-specific authorization.

  1. In the app registration page, click API permissions in the left-hand menu.

  2. Click + Add a permission.

  3. In the Request API permissions panel, click Microsoft Graph.

  4. Click Application permissions.

  5. Search for and select Sites.Selected.

  6. Click Add permissions.

  7. Click Grant admin consent for [your organization].

    This step requires admin privileges in your Microsoft Entra ID tenant.

  8. Click Yes to confirm.

    The permissions list shows a green checkmark in the Status column indicating admin consent has been granted.

Step 4b: Grant site-specific permissions

After configuring Sites.Selected in Azure, you must explicitly grant the application access to specific SharePoint sites. You can do this using either Microsoft Graph Explorer or a Python notebook. The following steps use Microsoft Graph Explorer.

  1. Go to Microsoft Graph Explorer.

  2. Sign in with an account that has admin permissions for your SharePoint site.

  3. Click Modify permissions, and then consent to the required permission (Sites.FullControl.All).

  4. Change the HTTP method to POST.

  5. Enter the following URL, replacing {site_id} with your SharePoint site ID from Step 1:

    https://graph.microsoft.com/v1.0/sites/{site_id}/permissions
  6. In the Request body section, paste the following JSON, replacing the placeholder values:

    {
      "roles": ["read"],
      "grantedToIdentities": [
        {
          "application": {
            "id": "<YOUR_CLIENT_ID>",
            "displayName": "<YOUR_APP_NAME>"
          }
        }
      ]
    }

    Replace:

    • <YOUR_CLIENT_ID>: The Application (client) ID from Step 3
    • <YOUR_APP_NAME>: The name of your application registration
  7. Click Run query.

    A successful response indicates the permission has been granted.
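The same request can be sent from a Python notebook. The following is a minimal sketch that reuses the URL and request body from the Graph Explorer steps. The function names are hypothetical, and `admin_token` is assumed to be a Microsoft Graph access token that carries Sites.FullControl.All. How you obtain that token is outside the scope of this sketch.

```python
import json
import urllib.request


def build_site_grant(site_id: str, client_id: str, app_name: str, role: str = "read"):
    """Return the Graph URL and JSON body that grant an app access to one site."""
    url = f"https://graph.microsoft.com/v1.0/sites/{site_id}/permissions"
    body = {
        "roles": [role],
        "grantedToIdentities": [
            {"application": {"id": client_id, "displayName": app_name}}
        ],
    }
    return url, body


def grant_site_access(site_id: str, client_id: str, app_name: str, admin_token: str) -> dict:
    """POST the grant to Microsoft Graph (requires network access)."""
    url, body = build_site_grant(site_id, client_id, app_name)
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError if Graph rejects the request.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Call `grant_site_access` once for each site that you want to ingest, passing the site ID from Step 1 and the Application (client) ID from Step 3.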

Step 5: Create a connection in Databricks

Follow Step 5 from the Sites.Read.All section to create the connection in Databricks using the credentials from your application registration.

Next steps

  1. Create a Microsoft SharePoint ingestion pipeline
  2. Common pipeline maintenance tasks