
Database connectors in Lakeflow Connect

Databricks Lakeflow Connect provides fully managed connectors for ingesting data from relational databases using change data capture (CDC). Each connector efficiently tracks changes in the source database and applies them incrementally to Delta tables.

Supported connectors

    • MySQL: Ingest data from MySQL databases using change data capture (CDC) for efficient incremental loads.
    • PostgreSQL: Ingest data from PostgreSQL databases using CDC.
    • Microsoft SQL Server: Ingest data from Microsoft SQL Server using CDC or a full snapshot.

Connector components

A database connector has the following components:

| Component | Description |
| --- | --- |
| Connection | A Unity Catalog securable object that stores authentication details for the database. |
| Ingestion gateway | A pipeline that extracts snapshots, change logs, and metadata from the source database. The gateway runs on classic compute, and it runs continuously so that it can capture changes before the source truncates its change logs. |
| Staging storage | A Unity Catalog volume that temporarily stores extracted data before it's applied to the destination tables. This decouples the ingestion pipeline's schedule from the gateway's continuous capture and also helps with failure recovery. A staging volume is created automatically when you deploy the gateway, and you can customize the catalog and schema where it lives. Data is automatically purged from staging after 30 days. |
| Ingestion pipeline | A pipeline that moves the data from staging storage into the destination tables. The pipeline runs on serverless compute. |
| Destination tables | The tables that the ingestion pipeline writes to. These are streaming tables: Delta tables with additional support for incremental data processing. |
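To make the division of labor concrete: the gateway and the ingestion pipeline are two separate pipeline objects, each created from its own spec. The sketch below builds illustrative create payloads for both. The field names (`gateway_definition`, `ingestion_definition`, and their sub-fields) and the example values are assumptions modeled on the shape of the Databricks Pipelines REST API, not a confirmed schema; verify them against the API reference for your connector.

```python
# Illustrative payloads for the two pipelines a database connector uses.
# Field names (gateway_definition, ingestion_definition, etc.) are
# assumptions modeled on the Databricks Pipelines API payload shape;
# verify against the documentation for your specific connector.

def gateway_payload(connection_id: str, catalog: str, schema: str) -> dict:
    """Ingestion gateway: runs continuously on classic compute and writes
    snapshots and change logs into a staging volume in catalog.schema."""
    return {
        "name": "sqlserver-gateway",
        "gateway_definition": {
            "connection_id": connection_id,      # Unity Catalog connection
            "gateway_storage_catalog": catalog,  # where the staging volume lives
            "gateway_storage_schema": schema,
            "gateway_storage_name": "sqlserver-staging",
        },
    }

def ingestion_payload(gateway_id: str, source: tuple, dest: tuple) -> dict:
    """Ingestion pipeline: runs on serverless compute and applies staged
    changes to streaming tables in the destination catalog/schema."""
    src_catalog, src_schema, src_table = source
    dest_catalog, dest_schema = dest
    return {
        "name": "sqlserver-ingest",
        "ingestion_definition": {
            "ingestion_gateway_id": gateway_id,
            "objects": [
                {
                    "table": {
                        "source_catalog": src_catalog,
                        "source_schema": src_schema,
                        "source_table": src_table,
                        "destination_catalog": dest_catalog,
                        "destination_schema": dest_schema,
                    }
                }
            ],
        },
    }
```

Both payloads would be sent to the pipeline-creation endpoint; the gateway's staging catalog and schema are the knobs mentioned above for customizing where staging storage lives.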

(Diagram: database connector components.)

Release statuses

| Connector | Release status |
| --- | --- |
| MySQL | Public Preview |
| PostgreSQL | Public Preview |
| SQL Server | Generally Available |

Feature availability

The following tables summarize feature availability for each database connector. For additional features and limitations, see the documentation for your specific connector.

MySQL

| Feature | Availability |
| --- | --- |
| UI-based pipeline authoring | Supported |
| API-based pipeline authoring | Supported |
| Declarative Automation Bundles | Supported |
| Incremental ingestion | Supported |
| Unity Catalog governance | Supported |
| Orchestration using Databricks Workflows | Supported |
| SCD type 2 | Supported |
| API-based column selection and deselection | Supported |
| API-based row filtering | Not supported |
| Automated schema evolution: New and deleted columns | Supported |
| Automated schema evolution: Data type changes | Not supported |
| Automated schema evolution: Column renames | Supported. Treated as a new column (new name) and a deleted column (old name). |
| Automated schema evolution: New tables | Supported if you ingest the entire schema. See the limitations on the number of tables per pipeline. |
| Maximum number of tables per pipeline | 250 |

PostgreSQL

| Feature | Availability |
| --- | --- |
| UI-based pipeline authoring | Supported |
| API-based pipeline authoring | Supported |
| Declarative Automation Bundles | Supported |
| Incremental ingestion | Supported |
| Unity Catalog governance | Supported |
| Orchestration using Databricks Workflows | Supported |
| SCD type 2 | Supported |
| API-based column selection and deselection | Supported |
| API-based row filtering | Not supported |
| Automated schema evolution: New and deleted columns | Supported |
| Automated schema evolution: Data type changes | Not supported |
| Automated schema evolution: Column renames | Not supported. Requires a full refresh. |
| Automated schema evolution: New tables | Supported if you ingest the entire schema. See the limitations on the number of tables per pipeline. |
| Maximum number of tables per pipeline | 250 |

SQL Server

| Feature | Availability |
| --- | --- |
| UI-based pipeline authoring | Supported |
| API-based pipeline authoring | Supported |
| Declarative Automation Bundles | Supported |
| Incremental ingestion | Supported |
| Unity Catalog governance | Supported |
| Orchestration using Databricks Workflows | Supported |
| SCD type 2 | Supported |
| API-based column selection and deselection | Supported |
| API-based row filtering | Not supported |
| Automated schema evolution: New and deleted columns | Supported |
| Automated schema evolution: Data type changes | Not supported |
| Automated schema evolution: Column renames | Not supported. Requires a full refresh. |
| Automated schema evolution: New tables | Supported if you ingest the entire schema. See the limitations on the number of tables per pipeline. |
| Maximum number of tables per pipeline | 250 |
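Per-table features in the matrices above, such as SCD type 2 and column selection, are typically expressed as table-level settings in the ingestion pipeline spec. The sketch below shows one plausible shape for such a setting; the `table_configuration` key and the names `scd_type` and `include_columns` are assumptions for illustration, not a confirmed schema, so check your connector's API reference for the real field names.

```python
# Hypothetical per-table configuration illustrating SCD type 2 and
# API-based column selection. Key names (table_configuration, scd_type,
# include_columns) are assumptions, not a confirmed schema.

def table_object(source_schema: str, source_table: str, dest_schema: str) -> dict:
    """Build one entry for the ingestion pipeline's list of ingested tables."""
    return {
        "table": {
            "source_schema": source_schema,
            "source_table": source_table,
            "destination_catalog": "main",
            "destination_schema": dest_schema,
            "table_configuration": {
                "scd_type": "SCD_TYPE_2",            # keep full row history
                "include_columns": ["id", "amount"],  # column selection (assumed key)
            },
        }
    }
```

Features marked "Not supported" above, such as API-based row filtering, would have no corresponding setting here regardless of payload shape.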

Authentication methods

The following table lists the supported authentication methods for each database connector. Databricks recommends using OAuth U2M or OAuth M2M when possible. If your connector supports OAuth, basic authentication is considered a legacy method.
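As a sketch of what registering a connection involves, the snippet below builds a create payload for a Unity Catalog connection using basic authentication, the one method all three connectors support today. The endpoint path and the field names (`connection_type`, `options`, and its keys) are assumptions modeled on the Unity Catalog Connections API; verify them against the API reference before use.

```python
# Sketch: payload for registering a Unity Catalog connection with basic
# authentication. Field names are assumptions modeled on the Unity Catalog
# Connections API; check the API reference before use.

def connection_payload(name: str, host: str, user: str, password: str) -> dict:
    """Basic (username/password) authentication, supported by all three
    database connectors."""
    return {
        "name": name,
        "connection_type": "POSTGRESQL",  # or MYSQL / SQLSERVER
        "options": {
            "host": host,
            "port": "5432",  # default PostgreSQL port; adjust per database
            "user": user,
            "password": password,
        },
    }

# The payload would be POSTed to the Unity Catalog connections endpoint
# (e.g., /api/2.1/unity-catalog/connections); the gateway then references
# the resulting connection by ID.
```

For SQL Server, which supports OAuth U2M and M2M, the `options` block would carry OAuth credentials instead of a username and password.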

MySQL

| Authentication method | Availability |
| --- | --- |
| OAuth U2M | Not supported |
| OAuth M2M | Not supported |
| OAuth (manual refresh token) | Not supported |
| Basic authentication (username/password) | Supported |
| Basic authentication (API key) | Not supported |
| Basic authentication (service account JSON key) | Not supported |

PostgreSQL

| Authentication method | Availability |
| --- | --- |
| OAuth U2M | Not supported |
| OAuth M2M | Not supported |
| OAuth (manual refresh token) | Not supported |
| Basic authentication (username/password) | Supported |
| Basic authentication (API key) | Not supported |
| Basic authentication (service account JSON key) | Not supported |

SQL Server

| Authentication method | Availability |
| --- | --- |
| OAuth U2M | Supported |
| OAuth M2M | Supported |
| OAuth (manual refresh token) | Not supported |
| Basic authentication (username/password) | Supported |
| Basic authentication (API key) | Not supported |
| Basic authentication (service account JSON key) | Not supported |