MySQL connector FAQ
The MySQL connector is in Public Preview. Contact your Databricks account team to request access.
Find answers to frequently asked questions about the MySQL connector.
What MySQL versions and platforms are supported?
The MySQL connector supports the following versions and platforms:
- Amazon RDS for MySQL: Version 5.7.44 and later (both standalone and HA deployments)
- Amazon Aurora MySQL: Version 5.7.mysql_aurora.2.12.2 and later (for HA setups, ingestion is supported only from the primary instance)
- Amazon Aurora MySQL Serverless: Supported
- Azure Database for MySQL Flexible Servers: Version 5.7.44 and later (both standalone and HA deployments)
- Google Cloud SQL for MySQL: Version 5.7.44 and later
- MySQL on EC2: Version 5.7.44 and later
What authentication methods are supported?
The MySQL connector supports the following authentication plugins based on your MySQL version:
- MySQL 5.7.44: Only `sha256_password` is supported. The replication user must be created using this authentication plugin.
- MySQL 8.0 and later: Both `sha256_password` and `caching_sha2_password` are supported.
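For example, a replication user compatible with MySQL 5.7.44 can be created with the `sha256_password` plugin. The user name and privilege grants below are illustrative, not connector requirements; consult the connector setup guide for the exact privileges needed:

```sql
-- Create a replication user with the sha256_password plugin (required on 5.7.44).
-- On MySQL 8.0 and later, caching_sha2_password also works.
CREATE USER 'repl_user'@'%' IDENTIFIED WITH sha256_password BY 'strong-password-here';

-- Typical privileges for binlog-based ingestion: replication plus read access for snapshots.
GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'repl_user'@'%';
GRANT SELECT ON *.* TO 'repl_user'@'%';
```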
Does the connector support GTID-based replication?
No, the MySQL connector doesn't support GTID (Global Transaction Identifier)-based replication. The connector uses position-based binlog replication.
You can still use the connector if GTID is enabled on your MySQL server, but the connector uses a binlog file and position-based replication regardless.
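You can confirm from a MySQL client that the binary log is available for position-based replication. The expected values below reflect a typical binlog-based CDC setup, not connector-specific requirements:

```sql
SHOW VARIABLES LIKE 'log_bin';        -- ON: binary logging is enabled
SHOW VARIABLES LIKE 'binlog_format';  -- ROW is the usual requirement for CDC
SHOW MASTER STATUS;                   -- current binlog file name and position
```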
Are XA transactions supported?
No, the MySQL connector doesn't support XA transactions (distributed transactions). If an XA transaction is performed, the table is skipped from the pipeline.
Can I ingest tables with spatial data types?
No, spatial data types (GEOMETRY, POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING, MULTIPOLYGON, GEOMETRYCOLLECTION) aren't supported.
If a table contains spatial columns, you must exclude the entire table from ingestion. Tables with spatial types are skipped when they are detected, or when a spatial column is later added to a table.
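To find tables that contain spatial columns before configuring a pipeline, you can query `information_schema`. This is a generic MySQL query, not part of the connector:

```sql
-- List every column whose type is a spatial type
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM information_schema.COLUMNS
WHERE DATA_TYPE IN ('geometry', 'point', 'linestring', 'polygon',
                    'multipoint', 'multilinestring', 'multipolygon',
                    'geometrycollection')
ORDER BY TABLE_SCHEMA, TABLE_NAME;
```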
Can I create multiple pipelines with the same destination table?
No, one destination table can be managed by only one Managed Ingestion pipeline. You can't create two different Managed Ingestion pipelines with overlapping destination tables.
Can I ingest tables with the same name from different schemas?
No, you can't ingest two tables with the same name in the same pipeline, even if they come from different source schemas. For example, you can't ingest both schema1.customers and schema2.customers in one pipeline.
To work around this, see Create multi-destination pipelines.
How do I rotate MySQL credentials?
To rotate credentials for an existing connection:
1. Update the password in MySQL.
2. In Databricks, go to Catalog Explorer.
3. Navigate to the connection.
4. Click Edit and update the password.
5. Save the changes.
The ingestion gateway and pipeline will automatically use the new credentials on the next run.
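The MySQL-side password change is a standard `ALTER USER` statement. The user name below is illustrative:

```sql
-- Rotate the replication user's password on the MySQL server
ALTER USER 'repl_user'@'%' IDENTIFIED BY 'new-strong-password';
```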
Are table and column names case-sensitive?
Yes. MySQL schema and table names are case-sensitive, whereas Unity Catalog catalog, schema, and table names are case-insensitive. If two source tables conflict only by case (for example, mytable vs. MyTable), use the multi-destination functionality to resolve the conflict.
For details, see Identifier Case Sensitivity in the MySQL documentation.
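To check how your server treats identifier case, and to find tables whose names would collide case-insensitively in Unity Catalog, you can run the following generic MySQL queries:

```sql
-- 0 = case-sensitive comparisons (typical on Linux); 1 or 2 = case-insensitive
SHOW VARIABLES LIKE 'lower_case_table_names';

-- Tables in the same schema whose names differ only by case
SELECT TABLE_SCHEMA, LOWER(TABLE_NAME) AS lowered_name, COUNT(*) AS conflicts
FROM information_schema.TABLES
GROUP BY TABLE_SCHEMA, LOWER(TABLE_NAME)
HAVING COUNT(*) > 1;
```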
What scale has been tested for the MySQL connector?
The connector has been tested with 100 tables in a single pipeline and less than 1 TB of total snapshot data.
Machine type recommendations for the ingestion gateway:
The default machine types are:
- AWS: `r5n.xlarge`
- Azure: `Standard_E4d_v4`
- GCP: `n2-highmem-4`
For better snapshot performance, consider `r5n.2xlarge` (AWS), `Standard_E8d_v4` (Azure), or `n2-highmem-8` (GCP).
Pipeline limits:
- Databricks recommends 250 or fewer tables per pipeline for optimal performance.
- Hard limit: 1,000 flows per pipeline, which effectively supports up to 500 tables.
Does the ingestion gateway support triggered mode execution?
No, the ingestion gateway pipeline doesn't support triggered mode. It must run continuously; otherwise, binlog files can be cleaned up before they are processed, forcing full refreshes.
The ingestion pipeline (not the gateway) can run on a schedule or be triggered, but the gateway must remain running continuously.
When should I perform a full refresh?
Perform a full refresh in the following scenarios:
- When a table flow is marked as failed in the ingestion pipeline.
- When an incompatible schema change causes the table flow to fail.
- When binlog files are cleaned up before the ingestion gateway replays them.
- When you need to manually re-sync a table.
To perform a full refresh, see Fully refresh target tables.
What should I do if binlog files are cleaned up before the ingestion gateway replays them?
If binlog files are purged before the ingestion gateway processes them:
- The ingestion gateway will detect this event and skip all affected tables
- Each skipped table will have appropriate DLT event logs indicating the issue
- You must trigger a full refresh for all affected tables in the pipeline
To prevent this from happening:
- Configure adequate binlog retention (7 days recommended)
- Ensure the ingestion gateway runs continuously
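How you configure the recommended 7-day binlog retention depends on the platform. For example:

```sql
-- Amazon RDS / Aurora MySQL: retention is set via a stored procedure (in hours)
CALL mysql.rds_set_configuration('binlog retention hours', 168);

-- Self-managed MySQL 8.0 and later: retention in seconds
SET GLOBAL binlog_expire_logs_seconds = 604800;

-- Self-managed MySQL 5.7: retention in days
SET GLOBAL expire_logs_days = 7;
```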
Does the connector support on-premises MySQL deployments?
Yes, on-premises MySQL deployments are supported when connected to a Databricks workspace through:
- Azure ExpressRoute
- AWS Direct Connect
- VPN connection
Ensure that:
- Sufficient network bandwidth is available for data transfer
- Network connectivity is stable and reliable
- Firewall rules allow traffic on the MySQL port (default 3306)
- Binlog retention is configured appropriately
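A quick way to validate connectivity and binlog settings from a host on the connected network is with standard tooling. The host name and user below are placeholders:

```shell
# TCP reachability on the MySQL port
nc -vz mysql.internal.example.com 3306

# Authenticated check: server version and whether binary logging is enabled
mysql --host=mysql.internal.example.com --port=3306 --user=repl_user -p \
      --execute="SELECT @@version, @@log_bin;"
```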
For more information, contact the Databricks Support team.