Confluence connector reference
This page contains reference material for the Confluence connector in Lakeflow Connect.
General connector behavior
Page hierarchy is preserved through parent-child relationship fields in the pages table.
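As a minimal sketch of how those parent-child fields might be used downstream, the following Python rebuilds each page's ancestor path from rows that have already been read out of the pages table. The key names `id`, `parent_id`, and `title` are assumptions made for illustration, not confirmed column names; substitute the actual field names from the pages schema.

```python
from typing import Optional


def build_paths(rows: list[dict]) -> dict[str, str]:
    """Map each page ID to an 'Ancestor / ... / Title' path."""
    by_id = {row["id"]: row for row in rows}
    paths: dict[str, str] = {}

    for page_id in by_id:
        titles = []
        seen = set()
        current: Optional[str] = page_id
        # Walk up the parent chain; stop at a root, a missing parent, or a cycle.
        while current in by_id and current not in seen:
            seen.add(current)
            titles.append(by_id[current]["title"])
            current = by_id[current].get("parent_id")
        paths[page_id] = " / ".join(reversed(titles))
    return paths


# Hypothetical rows; the key names are assumed, not the documented schema.
pages = [
    {"id": "1", "parent_id": None, "title": "Engineering"},
    {"id": "2", "parent_id": "1", "title": "Runbooks"},
    {"id": "3", "parent_id": "2", "title": "On-call"},
]
paths = build_paths(pages)
```

A parent that is a blog post or is missing from the input simply ends the walk, so the path starts at the nearest ancestor that is present.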
Automatic data transformations
Databricks automatically transforms the following Confluence data types to Delta-compatible data types.
Schemas
pages
| Field | Data type | Notes |
|---|---|---|
| | | Unique identifier of the page. |
| | | Current lifecycle state of the page (for example, current, draft, archived). |
| | | Title of the content as shown in the Confluence UI. |
| | | Timestamp when the page was last modified. This is used as the cursor column. |
| | | ID of the parent content (for example, page or blog post) if this content is nested. |
| | | Type of parent content (for example, page, blogpost). |
| | | Position of the page within its list of sibling pages or content. |
| | | ID of the user who originally created the content. |
| | | ID of the current owner of the content (might differ from the author). |
| | | ID of the previous owner of the content. |
| | | Timestamp when the content was initially created. |
| | | ID of the space to which the content belongs. |
| | | Container that holds the actual content of the page in one or more representations. |
| | | Raw XHTML content format stored in Confluence. |
| | | Specifies the format type for the content. |
| | | The actual content string (plain text, HTML, or storage XHTML depending on the representation). |
| | | JSON format for pages made in the new editor. |
| | | Format type of the content (for example, storage for raw format, view for rendered HTML, editor for the legacy editor). |
| | | The actual content string or structure. |
| | | URLs for viewing, editing, or accessing content using the UI or API. |
| | | Link to view the page in the normal Confluence UI. |
| | | Link to edit the page in the legacy editor. |
| | | Short, shareable URL for the page. |
| | | Link to edit the page in the new (fabric) editor. |
| | | Indicates whether the content is deleted (true) or not (false). |
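
The cursor column and the deletion flag described in this table are often combined when reading changes downstream. The following is a minimal sketch of that general pattern, not the connector's internal implementation. It assumes rows already loaded as Python dictionaries and uses the hypothetical key names `last_modified` and `is_deleted`; the actual field names come from the pages schema.

```python
from datetime import datetime, timezone


def changed_since(rows, cursor):
    """Split rows modified after `cursor` into live and deleted rows.

    Returns (live, deleted, new_cursor), where new_cursor is the latest
    modification timestamp seen, or the old cursor if nothing changed.
    """
    live, deleted = [], []
    new_cursor = cursor
    for row in rows:
        modified = row["last_modified"]
        if modified <= cursor:
            continue  # already processed in an earlier run
        (deleted if row["is_deleted"] else live).append(row)
        new_cursor = max(new_cursor, modified)
    return live, deleted, new_cursor


# Hypothetical rows; the key names are assumed, not the documented schema.
rows = [
    {"id": "1", "last_modified": datetime(2024, 1, 1, tzinfo=timezone.utc), "is_deleted": False},
    {"id": "2", "last_modified": datetime(2024, 3, 1, tzinfo=timezone.utc), "is_deleted": False},
    {"id": "3", "last_modified": datetime(2024, 4, 1, tzinfo=timezone.utc), "is_deleted": True},
]
cursor = datetime(2024, 2, 1, tzinfo=timezone.utc)
live, deleted, new_cursor = changed_since(rows, cursor)
```

Keeping deleted rows in a separate list lets a consumer remove them from a derived table instead of silently ignoring them.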
spaces
| Field | Data type | Notes |
|---|---|---|
| | | Unique identifier of the space. |
| | | Unique identifier string for a space, used in URLs like |
| | | Display name of the space (for example, "Engineering", "Marketing Docs"). |
| | | Type of space (usually global or personal). |
| | | Current lifecycle state of the space (for example, current, draft, archived). |
| | | ID of the user who created the space. |
| | | Timestamp when the space was created. |
| | | ID of the home page for this space. |
| | | Container for different representations of the space description (for example, plain for unformatted text, view for rendered HTML). |
| | | Text-only representation of content, with no formatting (used under fields like description.plain). |
| | | Specifies the format type for the content. |
| | | The actual content string (plain text, HTML, or storage XHTML depending on the representation). |
| | | Rendered HTML view of the description as seen in the UI. |
| | | Specifies the format type for the content. |
| | | The actual content string (plain text, HTML, or storage XHTML depending on the representation). |
| | | Icon metadata associated with the space (for example, custom logo or default avatar). |
| | | Relative path to the space's icon or base page (used in URLs). |
| | | API endpoint to download the space icon or attachment (if applicable). |
| | | URLs for viewing, editing, or accessing content using the UI or API. |
| | | Link to view the space in the normal Confluence UI. |
labels
| Field | Data type | Notes |
|---|---|---|
| | | Unique identifier of the label. |
| | | The label's actual text value (for example, engineering, draft). |
| | | The type of label, indicating scope (for example, global, my). |
classification_levels
| Field | Data type | Notes |
|---|---|---|
| | | Unique identifier for the classification level. |
| | | URL-friendly string used as a unique key for the level. |
| | | Human-readable name of the classification level (for example, "Confidential"). |
| | | Type or category of the classification level. |
| | | Current lifecycle status (for example, active or archived). |
| | | ID of the user who created the classification level. |
| | | Timestamp when the classification level was created. |
| | | ID of the associated homepage or main content, if applicable. |
| | | A container for different representations of the classification level description (for example, plain for unformatted text or view for rendered HTML). |
| | | Plain-text version of the description (no formatting). |
| | | Specifies the format type for the content. |
| | | The actual content string (plain text, HTML, or storage XHTML depending on the representation). |
| | | HTML-rendered version of the description for display purposes. |
| | | Specifies the format type for the content. |
| | | The actual content string (plain text, HTML, or storage XHTML depending on the representation). |
| | | Icon metadata for the classification level (for example, URL, path, size). |
| | | Relative path to the classification icon or main page. |
| | | API endpoint to download the icon or attachment, if present. |
| | | Collection of related web or API links for this classification level. |
| | | Link to view the classification level in the Confluence UI. |
blogposts
| Field | Data type | Notes |
|---|---|---|
| | | Unique identifier of the blog post. |
| | | Lifecycle state (for example, current, draft, archived). |
| | | Title of the blog post. |
| | | Timestamp when the blog post was last modified. This is used as the cursor column. |
| | | ID of the space the blog post belongs to. |
| | | ID of the user who created the blog post. |
| | | Timestamp when the blog post was created. |
| | | Container for the actual content of the blog post in one or more formats. |
| | | Contains URLs for viewing or editing the blog post. |
| | | Link to view the blog post in the standard Confluence UI. |
| | | Link to edit the blog post in the legacy editor. |
| | | Short, shareable URL for the blog post. |
| | | Indicates whether the blog post is deleted (true) or not (false). |
attachments
| Field | Data type | Notes |
|---|---|---|
| | | Lifecycle state of the attachment (for example, current, deleted). |
| | | File name or title of the attachment. |
| | | Timestamp when the attachment was uploaded. |
| | | Timestamp of the last modification to the attachment. This is used as the cursor column. |
| | | ID of the page that the attachment is linked to. |
| | | ID of the blog post that the attachment is linked to (NULL if not applicable). |
| | | ID of the custom content item that the attachment is linked to. Typically populated when the attachment is not linked to a page or blog post but to a non-standard content type (for example, a whiteboard created with the Confluence whiteboards feature). |
| | | MIME type of the file (for example, image/png, application/pdf). |
| | | Human-readable description of the file type (for example, "PNG image"). |
| | | Optional comment or note added to the attachment. |
| | | Unique ID of the attachment file itself. |
| | | Size of the file in bytes. |
| | | Link to view the attachment in the Confluence UI. |
| | | Direct URL to download the attachment. |
| | | Object containing structured links related to the attachment. |
| | | Relative link to view the attachment in the web UI. |
| | | Relative link to download the attachment using the UI or API. |
| | | Indicates whether the attachment has been deleted. |
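
Because an attachment row can point at a page, a blog post, or a custom content item, downstream code usually needs to work out which parent applies. The following is a minimal sketch of that lookup, assuming rows loaded as Python dictionaries and the hypothetical key names `page_id`, `blogpost_id`, and `custom_content_id`; the actual field names come from the attachments schema.

```python
# Parent kinds in the order they are checked; the key names are assumed.
PARENT_KEYS = (
    ("page", "page_id"),
    ("blogpost", "blogpost_id"),
    ("custom_content", "custom_content_id"),
)


def attachment_parent(row):
    """Return (kind, parent_id) for the first populated parent reference, else None."""
    for kind, key in PARENT_KEYS:
        value = row.get(key)
        if value is not None:
            return kind, value
    return None


# Hypothetical rows for illustration only.
on_page = {"page_id": "101", "blogpost_id": None, "custom_content_id": None}
on_whiteboard = {"page_id": None, "blogpost_id": None, "custom_content_id": "wb-7"}
unlinked = {"page_id": None, "blogpost_id": None, "custom_content_id": None}

page_parent = attachment_parent(on_page)
whiteboard_parent = attachment_parent(on_whiteboard)
no_parent = attachment_parent(unlinked)
```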