Outlook connector reference
This page contains reference documentation for the Outlook connector in Lakeflow Connect.
Connection properties
When you create the Unity Catalog connection, you must specify the following properties. See Configure authentication to Microsoft Outlook for how to obtain these values.
| Property | Description |
|---|---|
| Client ID | The Application (client) ID from the Microsoft Entra ID app registration. |
| Client secret | The client secret value from the Microsoft Entra ID app registration. |
| Tenant ID | The Directory (tenant) ID from the Microsoft Entra ID app registration. |
Destination schema
The connector produces a single table, email_messages, under the default schema.
- Primary key: (mailbox, outlook_message_id)
- Incremental sync cursor: received_at, tracked per mailbox and folder
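To illustrate how a composite primary key and a cursor tracked per mailbox and folder fit together, here is a minimal Python sketch. It is not the connector's implementation: the `apply_batch` helper, the in-memory dictionaries, and the `folder` field name are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-ins for the destination table and cursor state.
Table = Dict[Tuple[str, str], dict]   # keyed by (mailbox, outlook_message_id)
Cursors = Dict[Tuple[str, str], str]  # keyed by (mailbox, folder) -> latest received_at


def apply_batch(table: Table, cursors: Cursors, messages: List[dict]) -> int:
    """Upsert messages by primary key and advance each (mailbox, folder) cursor.

    Returns the number of messages that were written to the table.
    """
    # Compare against the cursors as they were when the batch started, so that
    # out-of-order messages inside one batch are not skipped by mistake.
    start = dict(cursors)
    applied = 0
    for msg in messages:
        scope = (msg["mailbox"], msg["folder"])  # "folder" is an assumed field name
        cursor = start.get(scope)
        # ISO-8601 timestamps in one consistent UTC format sort correctly as strings.
        if cursor is not None and msg["received_at"] < cursor:
            continue  # strictly older than the cursor: already ingested
        # Upsert: a message re-delivered at the cursor boundary overwrites itself.
        table[(msg["mailbox"], msg["outlook_message_id"])] = msg
        cursors[scope] = max(cursors.get(scope, ""), msg["received_at"])
        applied += 1
    return applied
```

Because writes are keyed on (mailbox, outlook_message_id), re-reading a message whose received_at equals the cursor does not create a duplicate row in this sketch.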
email_messages
| Column | Type | Description |
|---|---|---|
| mailbox | | Email address of the mailbox. Part of the primary key. |
| outlook_message_id | | Unique message ID from the Microsoft Graph API. Part of the primary key. |
| | | RFC 2822 internet message ID. |
| | | Conversation thread ID. |
| | | Folder display name (for example, |
| | | List of recipient email addresses. |
| | | List of CC recipient email addresses. |
| | | List of BCC recipient email addresses. |
| | | Sender email address. |
| | | Actual sender email address (might differ from |
| | | List of reply-to email addresses. |
| | | Email subject line. |
| | | Importance level (for example, |
| | | Whether the message has been read. |
| | | Internet message ID of the parent message, from email headers. |
| | | Array of referenced message IDs, from email headers. |
| | | Preview of the email body. |
| | | Complete body content. Format is HTML or plain text, based on the |
| | | Unique body content, excluding quoted text from replies. |
| received_at | | Date and time the message was received (ISO-8601). Used as the incremental sync cursor. |
| | | Date and time the message was sent (ISO-8601). |
| | | User-defined categories or tags on the message. |
| | | Array of attachment structs. Omitted when |
Attachment struct
| Field | Type | Description |
|---|---|---|
| | | ID of the attachment from the Microsoft Graph API. |
| | | Original filename. |
| | | MIME type (for example, |
| | | File size in bytes. |
| | | Type indicator (for example, |
| | | Whether the attachment is inline (for example, an embedded image in a signature). |
| | | Base64-encoded file content. |
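Because the file content is stored as a Base64 string, a consumer has to decode it before writing the file. The following is a minimal Python sketch of that step. The field names `name` and `content_bytes` are hypothetical placeholders, not the struct's documented field names, and `save_attachment` is an illustrative helper.

```python
import base64
from pathlib import Path


def save_attachment(attachment: dict, out_dir: Path) -> Path:
    """Decode one attachment struct and write it to out_dir.

    "name" and "content_bytes" are assumed field names; substitute the
    actual field names of the attachment struct.
    """
    data = base64.b64decode(attachment["content_bytes"])
    # Keep only the final path component so a crafted filename
    # such as "../x" cannot escape out_dir.
    safe_name = Path(attachment["name"]).name
    target = out_dir / safe_name
    target.write_bytes(data)
    return target
```

Inline attachments, such as signature images, can be skipped before decoding by checking the inline flag on the struct.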
Connector options
These options are specified under outlook_options in the pipeline specification. See Filter combination logic for how multiple filter options interact.
| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| | | No | All accessible mailboxes | List of mailbox email addresses to sync. If not specified, the connector discovers and ingests all accessible mailboxes in the tenant using the Microsoft Graph |
| include_folders | | No | | List of folder display names to sync. Examples: |
| include_senders | | No | All senders | Filter emails by sender email address using exact match. Example: |
| | | No | All subjects | Filter emails by subject line. Values ending with |
| | | No | Complete history from epoch | Start date for the initial sync in |
| | | No | | Controls the email body content format. |
| | | No | | Controls which attachments to ingest. |
Filter combination logic
An email message is ingested when it matches at least one value from each specified filter category. Multiple filter categories are combined with AND logic; values within a single category use OR logic.
Example: include_folders=["Inbox"] AND include_senders=["user@vendor.com", "alerts@system.io"] ingests emails from the Inbox folder that are sent by either user@vendor.com OR alerts@system.io.
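The rule above (AND across filter categories, OR within a category) can be sketched in Python as follows. This is an illustration of the logic only, not the connector's code: `matches_filters` is a hypothetical helper, the message field names `folder` and `sender` are assumptions, and exact matching on folder names is assumed for simplicity.

```python
from typing import Dict, List


def matches_filters(message: Dict[str, str], filters: Dict[str, List[str]]) -> bool:
    """Return True when the message satisfies every specified filter category."""
    # Maps each filter option to the message field it checks.
    # The field names "folder" and "sender" are assumed for this sketch.
    field_for_option = {
        "include_folders": "folder",
        "include_senders": "sender",
    }
    for option, field in field_for_option.items():
        allowed = filters.get(option)
        if not allowed:
            continue  # an unspecified category places no restriction
        # OR within a category: any one listed value is enough.
        if message.get(field) not in allowed:
            return False  # AND across categories: one miss rejects the message
    return True
```

With the filters from the example, a message in Inbox from alerts@system.io passes, while the same sender's message in another folder does not.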