Confluence connector reference
The Confluence connector is in Beta.
This page contains reference material for the Confluence connector in Databricks Lakeflow Connect.
General connector behavior
Page hierarchy is preserved through parent-child relationship fields in the pages table.
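As a rough illustration of how the parent-child fields might be used downstream, the sketch below rebuilds each page's ancestor path from a list of rows. The keys `id`, `parent_id`, and `title` are assumptions made for this example, not confirmed column names from the pages schema; substitute the actual field names from your ingested table.

```python
from typing import Optional


def build_paths(rows: list[dict]) -> dict[str, str]:
    """Map each page ID to a 'Root / Child / Grandchild' title path.

    Assumes hypothetical keys 'id', 'parent_id', and 'title' on every row.
    """
    by_id = {row["id"]: row for row in rows}
    paths: dict[str, str] = {}

    def path_of(page_id: str, seen: frozenset = frozenset()) -> str:
        if page_id in paths:
            return paths[page_id]
        seen = seen | {page_id}
        row = by_id[page_id]
        parent: Optional[str] = row.get("parent_id")
        # Treat a missing parent, an unknown parent, or a cycle as a root.
        if parent is None or parent not in by_id or parent in seen:
            result = row["title"]
        else:
            result = path_of(parent, seen) + " / " + row["title"]
        paths[page_id] = result
        return result

    for row in rows:
        path_of(row["id"])
    return paths


pages = [
    {"id": "1", "parent_id": None, "title": "Home"},
    {"id": "2", "parent_id": "1", "title": "Guides"},
    {"id": "3", "parent_id": "2", "title": "Setup"},
]
paths = build_paths(pages)
```

Pages whose parent is absent from the input (for example, a parent that is a blog post or was not ingested) are treated as roots here; that is a design choice of the sketch, not documented connector behavior.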
Automatic data transformations
Databricks automatically transforms the following Confluence data types to Delta-compatible data types.
Schemas
pages
| Field | Data type | Notes |
|---|---|---|
| | | Unique identifier of the page. |
| | | Current lifecycle state of the page (e.g., current, draft, archived). |
| | | Title of the content as shown in the Confluence UI. |
| | | Timestamp when the page was last modified. This is used as the cursor column. |
| | | ID of the parent content (e.g., page or blog post) if this content is nested. |
| | | Type of parent content (page, blogpost, etc.). |
| | | Location index of a page within a list of sibling pages or content. |
| | | ID of the user who originally created the content. |
| | | ID of the current owner of the content (may differ from the author). |
| | | ID of the previous owner of the content. |
| | | Timestamp when the content was initially created. |
| | | ID of the space to which the content belongs. |
| | | Container that holds the actual content of the page in one or more representations. |
| | | Raw XHTML content format stored in Confluence. |
| | | Specifies the format type for the content. |
| | | The actual content string (plain text, HTML, or storage XHTML depending on the representation). |
| | | JSON format for pages made in the new editor. |
| | | Format type of the content (e.g., storage for raw format, view for rendered HTML, editor for the legacy editor, etc.). |
| | | The actual content string or structure. |
| | | URLs for viewing, editing, or accessing content via the UI or API. |
| | | Link to view the page in the normal Confluence UI. |
| | | Link to edit the page in the legacy editor. |
| | | Short, shareable URL for the page. |
| | | Link to edit the page in the new (fabric) editor. |
| | | Indicates whether the content is deleted (true) or not (false). |
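The cursor column and deletion flag described in the pages table lend themselves to a simple incremental-read pattern. The sketch below only illustrates that idea on in-memory rows; the keys `last_modified_at` and `is_deleted` are hypothetical stand-ins for the real column names, and the code does not reflect how the connector itself is implemented.

```python
from datetime import datetime, timezone


def changed_since(rows, watermark):
    """Return live rows modified after the watermark, plus the new watermark."""
    fresh = [r for r in rows if r["last_modified_at"] > watermark]
    # Advance the watermark over deleted rows too, so they are not re-read.
    new_watermark = max((r["last_modified_at"] for r in fresh), default=watermark)
    live = [r for r in fresh if not r["is_deleted"]]
    return live, new_watermark


def day(d):
    """Helper for the sample data: midnight UTC on a day in January 2024."""
    return datetime(2024, 1, d, tzinfo=timezone.utc)


rows = [
    {"id": "a", "last_modified_at": day(1), "is_deleted": False},
    {"id": "b", "last_modified_at": day(3), "is_deleted": False},
    {"id": "c", "last_modified_at": day(4), "is_deleted": True},
]
live, watermark = changed_since(rows, day(2))
```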
spaces
| Field | Data type | Notes |
|---|---|---|
| | | Unique identifier of the space. |
| | | Unique identifier string for a space, used in URLs like https://your-site.atlassian.net/wiki/spaces/{KEY} |
| | | Display name of the space (e.g., "Engineering", "Marketing Docs"). |
| | | Type of space (usually global or personal). |
| | | Current lifecycle state of the space (e.g., current, draft, archived). |
| | | ID of the user who created the space. |
| | | Timestamp when the space was created. |
| | | ID of the home page for this space. |
| | | Container for different representations of the space description (e.g., plain for unformatted text, view for rendered HTML, etc.). |
| | | Text-only representation of content, with no formatting (used under fields like description.plain). |
| | | Specifies the format type for the content. |
| | | The actual content string (plain text, HTML, or storage XHTML depending on the representation). |
| | | Rendered HTML view of the description as seen in the UI. |
| | | Specifies the format type for the content. |
| | | The actual content string (plain text, HTML, or storage XHTML depending on the representation). |
| | | Icon metadata associated with the space (e.g., custom logo or default avatar). |
| | | Relative path to the space's icon or base page (used in URLs). |
| | | API endpoint to download the space icon or attachment (if applicable). |
| | | URLs for viewing, editing, or accessing content via the UI or API. |
| | | Link to view the space in the normal Confluence UI. |
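The space key's description mentions the URL pattern `https://your-site.atlassian.net/wiki/spaces/{KEY}`. A minimal sketch of building such a URL from a key follows; the function name `space_url` and its `site` parameter are hypothetical, introduced only for this example.

```python
from urllib.parse import quote


def space_url(site: str, key: str) -> str:
    """Build a space URL following the pattern in the key field's description."""
    # Percent-encode the key so unexpected characters cannot break the path.
    return f"https://{site}.atlassian.net/wiki/spaces/{quote(key, safe='')}"


url = space_url("your-site", "ENG")
```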
labels
| Field | Data type | Notes |
|---|---|---|
| | | Unique identifier of the label. |
| | | The label's actual text value (e.g., engineering, draft). |
| | | The type of label, indicating scope (e.g., global, my). |
classification_levels
| Field | Data type | Notes |
|---|---|---|
| | | Unique identifier for the classification level. |
| | | URL-friendly string used as a unique key for the level. |
| | | Human-readable name of the classification level (e.g., "Confidential"). |
| | | Type or category of the classification level. |
| | | Current lifecycle status (e.g., active, archived). |
| | | ID of the user who created the classification level. |
| | | Timestamp when the classification level was created. |
| | | ID of the associated homepage or main content, if applicable. |
| | | A container for different representations of the classification level description (e.g., plain for unformatted text, view for rendered HTML, etc.). |
| | | Plain-text version of the description (no formatting). |
| | | Specifies the format type for the content. |
| | | The actual content string (plain text, HTML, or storage XHTML depending on the representation). |
| | | HTML-rendered version of the description for display purposes. |
| | | Specifies the format type for the content. |
| | | The actual content string (plain text, HTML, or storage XHTML depending on the representation). |
| | | Icon metadata for the classification level (e.g., URL, path, size). |
| | | Relative path to the classification icon or main page. |
| | | API endpoint to download the icon or attachment, if present. |
| | | Collection of related web or API links for this classification level. |
| | | Link to view the classification level in the Confluence UI. |
blogposts
| Field | Data type | Notes |
|---|---|---|
| | | Unique identifier of the blog post. |
| | | Lifecycle state (e.g., current, draft, archived). |
| | | Title of the blog post. |
| | | Timestamp when the blog post was last modified. This is used as the cursor column. |
| | | ID of the space the blog post belongs to. |
| | | ID of the user who created the blog post. |
| | | Timestamp when the blog post was created. |
| | | Container for the actual content of the blog post in one or more formats. |
| | | Contains URLs for viewing or editing the blog post. |
| | | Link to view the blog post in the standard Confluence UI. |
| | | Link to edit the blog post in the legacy editor. |
| | | Short, shareable URL for the blog post. |
| | | Indicates whether the blog post is deleted (true) or not (false). |
attachments
| Field | Data type | Notes |
|---|---|---|
| | | Lifecycle state of the attachment (e.g., current, deleted). |
| | | Filename/title of the attachment. |
| | | Timestamp when the attachment was uploaded. |
| | | Timestamp of the last modification to the attachment. This is used as the cursor column. |
| | | ID of the page that the attachment is linked to. |
| | | ID of the blog post that the attachment is linked to (if applicable; NULL if not). |
| | | ID for custom content types using attachments. Typically used when the attachment is not linked to a page or blog post, that is, when it belongs to a non-standard content type (e.g., a whiteboard created with the Confluence whiteboards feature). |
| | | MIME type of the file (e.g., image/png, application/pdf). |
| | | Human-readable description of the file type (e.g., "PNG image"). |
| | | Optional comment or note added to the attachment. |
| | | Unique ID of the attachment file itself. |
| | | Size of the file in bytes. |
| | | Link to view the attachment in the Confluence UI. |
| | | Direct URL to download the attachment. |
| | | Object containing structured links related to the attachment. |
| | | Relative link to view the attachment in the web UI. |
| | | Relative link to download the attachment via the UI or API. |
| | | Indicates whether the attachment has been deleted. |
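Because an attachment can be linked to a page, a blog post, or a custom content type, downstream code often needs to work out which parent applies to a given row. The sketch below shows one way to do that; the keys `page_id`, `blogpost_id`, and `custom_content_id` are hypothetical names chosen for this example and may not match the actual column names.

```python
def attachment_parent(row: dict):
    """Return (kind, id) for the content an attachment belongs to, or None."""
    # Check the candidate parent columns in order; the first non-null one wins.
    for kind, key in (
        ("page", "page_id"),
        ("blogpost", "blogpost_id"),
        ("custom_content", "custom_content_id"),
    ):
        value = row.get(key)
        if value is not None:
            return kind, value
    return None


parent = attachment_parent(
    {"page_id": None, "blogpost_id": "42", "custom_content_id": None}
)
```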