MySQL connector FAQ
The MySQL connector is in Public Preview. Contact your Databricks account team to request access.
Find answers to frequently asked questions about the MySQL connector.
What MySQL versions and platforms are supported?
The MySQL connector supports the following versions and platforms:
- Amazon RDS for MySQL: Version 5.7.44 and later (both standalone and HA deployments)
- Amazon Aurora MySQL: Version 5.7.mysql_aurora.2.12.2 and later (for HA setups, ingestion is supported only from the primary instance)
- Amazon Aurora MySQL Serverless: Supported
- Azure Database for MySQL Flexible Servers: Version 5.7.44 and later (both standalone and HA deployments)
- Google Cloud SQL for MySQL: Version 5.7.44 and later
- MySQL on EC2: Version 5.7.44 and later
What authentication methods are supported?
The MySQL connector supports the following authentication plugins based on your MySQL version:
- MySQL 5.7.44: Only `sha256_password` is supported. The replication user must be created using this authentication plugin.
- MySQL 8.0 and later: Both `sha256_password` and `caching_sha2_password` are supported.
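For example, a replication user compatible with MySQL 5.7.44 can be created with the `sha256_password` plugin. The user name and privilege grants below are illustrative, not connector requirements; consult the connector setup guide for the exact privileges needed:

```sql
-- Create a replication user with the sha256_password plugin (required on 5.7.44).
-- On MySQL 8.0 and later, caching_sha2_password also works.
CREATE USER 'repl_user'@'%' IDENTIFIED WITH sha256_password BY 'strong-password-here';

-- Typical privileges for binlog-based ingestion: replication plus read access for snapshots.
GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'repl_user'@'%';
GRANT SELECT ON *.* TO 'repl_user'@'%';
```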
Does the connector support GTID-based replication?
No, the MySQL connector doesn't support GTID (Global Transaction Identifier)-based replication. The connector uses position-based binlog replication.
You can still use the connector if GTID is enabled on your MySQL server, but the connector uses a binlog file and position-based replication regardless.
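You can confirm from a MySQL client that the binary log is available for position-based replication. The expected values below reflect a typical binlog-based CDC setup, not connector-specific requirements:

```sql
SHOW VARIABLES LIKE 'log_bin';        -- ON: binary logging is enabled
SHOW VARIABLES LIKE 'binlog_format';  -- ROW is the usual requirement for CDC
SHOW MASTER STATUS;                   -- current binlog file name and position
```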
Are XA transactions supported?
No, the MySQL connector doesn't support XA transactions (distributed transactions). If an XA transaction is performed, the table is skipped from the pipeline.
Can I ingest tables with spatial data types?
No, spatial data types (GEOMETRY, POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING, MULTIPOLYGON, GEOMETRYCOLLECTION) aren't supported.
If a table contains spatial columns, you must exclude the entire table from ingestion. Tables with spatial types are skipped when they are detected, or when a spatial column is later added to a table.
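To find tables that contain spatial columns before configuring a pipeline, you can query `information_schema`. This is a generic MySQL query, not part of the connector:

```sql
-- List every column whose type is a spatial type
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM information_schema.COLUMNS
WHERE DATA_TYPE IN ('geometry', 'point', 'linestring', 'polygon',
                    'multipoint', 'multilinestring', 'multipolygon',
                    'geometrycollection')
ORDER BY TABLE_SCHEMA, TABLE_NAME;
```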
Can I create multiple pipelines with the same destination table?
No, one destination table can be managed by only one Managed Ingestion pipeline. You can't create two different Managed Ingestion pipelines with overlapping destination tables.
Can I ingest tables with the same name from different schemas?
No, you can't ingest two tables with the same name in the same pipeline, even if they come from different source schemas. For example, you can't ingest both schema1.customers and schema2.customers in one pipeline.
To work around this, see Create multi-destination pipelines.
How do I rotate MySQL credentials?
To rotate credentials for an existing connection:
1. Update the password in MySQL.
2. In Databricks, go to Catalog Explorer.
3. Navigate to the connection.
4. Click Edit and update the password.
5. Save the changes.
The ingestion gateway and pipeline will automatically use the new credentials on the next run.
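The MySQL-side password change is a standard `ALTER USER` statement. The user name below is illustrative:

```sql
-- Rotate the replication user's password on the MySQL server
ALTER USER 'repl_user'@'%' IDENTIFIED BY 'new-strong-password';
```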
Are table and column names case-sensitive?
Yes. MySQL schema and table names are case-sensitive, whereas Unity Catalog catalog, schema, and table names are case-insensitive. If two source tables conflict only by case (for example, mytable vs. MyTable), use the multi-destination functionality to resolve the conflict.
For details, see Identifier Case Sensitivity in the MySQL documentation.
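To check how your server treats identifier case, and to find tables whose names would collide case-insensitively in Unity Catalog, you can run the following generic MySQL queries:

```sql
-- 0 = case-sensitive comparisons (typical on Linux); 1 or 2 = case-insensitive
SHOW VARIABLES LIKE 'lower_case_table_names';

-- Tables in the same schema whose names differ only by case
SELECT TABLE_SCHEMA, LOWER(TABLE_NAME) AS lowered_name, COUNT(*) AS conflicts
FROM information_schema.TABLES
GROUP BY TABLE_SCHEMA, LOWER(TABLE_NAME)
HAVING COUNT(*) > 1;
```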
What scale has been tested for the MySQL connector?
The connector has been tested with 100 tables in a single pipeline and less than 1 TB of total snapshot data.
Machine type recommendations for the ingestion gateway:
The default machine types are:
- AWS: `r5n.xlarge`
- Azure: `Standard_E4d_v4`
- GCP: `n2-highmem-4`
For better snapshot performance, consider `r5n.2xlarge` (AWS), `Standard_E8d_v4` (Azure), or `n2-highmem-8` (GCP).
Pipeline limits:
- Databricks recommends 250 or fewer tables per pipeline for optimal performance.
- Hard limit: 1,000 flows per pipeline, which effectively supports up to 500 tables.
Does the ingestion gateway support triggered mode execution?
No, the ingestion gateway pipeline doesn't support triggered mode. It must run continuously; otherwise, binlog files can be cleaned up before they are processed, forcing full refreshes.
The ingestion pipeline (not the gateway) can run on a schedule or be triggered, but the gateway must remain running continuously.
When should I perform a full refresh?
Perform a full refresh in the following scenarios:
- When a table flow is marked as failed in the ingestion pipeline.
- When an incompatible schema change causes the table flow to fail.
- When binlog files are cleaned up before the ingestion gateway replays them.
- When you need to manually re-sync a table.
To perform a full refresh, see Fully refresh target tables.
What should I do if binlog files are cleaned up before the ingestion gateway replays them?
If binlog files are purged before the ingestion gateway processes them:
- The ingestion gateway will detect this event and skip all affected tables
- Each skipped table will have appropriate DLT event logs indicating the issue
- You must trigger a full refresh for all affected tables in the pipeline
To prevent this from happening:
- Configure adequate binlog retention (7 days recommended)
- Ensure the ingestion gateway runs continuously
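How you configure the recommended 7-day binlog retention depends on the platform. For example:

```sql
-- Amazon RDS / Aurora MySQL: retention is set via a stored procedure (in hours)
CALL mysql.rds_set_configuration('binlog retention hours', 168);

-- Self-managed MySQL 8.0 and later: retention in seconds
SET GLOBAL binlog_expire_logs_seconds = 604800;

-- Self-managed MySQL 5.7: retention in days
SET GLOBAL expire_logs_days = 7;
```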
Does the connector support on-premises MySQL deployments?
Yes, on-premises MySQL deployments are supported when connected to a Databricks workspace through:
- Azure ExpressRoute
- AWS Direct Connect
- VPN connection
Ensure that:
- Sufficient network bandwidth is available for data transfer
- Network connectivity is stable and reliable
- Firewall rules allow traffic on the MySQL port (default 3306)
- Binlog retention is configured appropriately
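A quick way to validate connectivity and binlog settings from a host on the connected network is with standard tooling. The host name and user below are placeholders:

```shell
# TCP reachability on the MySQL port
nc -vz mysql.internal.example.com 3306

# Authenticated check: server version and whether binary logging is enabled
mysql --host=mysql.internal.example.com --port=3306 --user=repl_user -p \
      --execute="SELECT @@version, @@log_bin;"
```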
For more information, contact the Databricks Support team.