Configure Azure Database for MySQL for ingestion

Preview

The MySQL connector is in Public Preview. Contact your Databricks account team to request access.

Learn how to configure Azure Database for MySQL for ingestion into Databricks. You must enable binary logging and configure binlog retention to support change data capture.

Requirements

Azure Database for MySQL version 5.7.44 and later, or 8.0 and later (Flexible Server).
Permission to modify server parameters.

Configure server parameters

Binary logging is controlled through server parameters in Azure Database for MySQL. You must configure these parameters through the Azure portal.

Required server parameters

Configure the following server parameters:

Parameter	Value	Description
`log_bin`	`ON`	Enables binary logging. `log_bin` is enabled by default in Azure MySQL Flexible Server and is not configurable. If this option is available, set it to `ON`.
`binlog_format`	`ROW`	Sets the binary log format to row-based replication. Required for change data capture.
`binlog_row_image`	`FULL`	Logs all columns in the binary log, not just changed columns.
`binlog_expire_logs_seconds`	Minimum: `86400` (one day). Recommended: `604800` (seven days)	Specifies how long binary logs are kept before automatic purging.

Azure portal

To configure server parameters using the Azure portal:

Sign in to the Azure portal and navigate to your Azure Database for MySQL server.
In the left menu, under Settings, select Server parameters.
Search for binlog_format and set the value to ROW.
Search for binlog_row_image and set the value to FULL.
Search for binlog_expire_logs_seconds and set the minimum value to 86400 (one day). Databricks recommends setting it to 604800 (seven days).
Click Save at the top of the page.
When prompted, restart the server for the changes to take effect.

Binlog retention considerations

The minimum binlog retention period is one day (24 hours or 86400 seconds). Databricks recommends a binlog retention period of seven days (168 hours or 604800 seconds). If you set a lower value, binlogs might be deleted before the ingestion gateway processes them, requiring a full refresh of all tables.

A full refresh is also required if the gateway cannot continuously process binlogs. If the gateway is stopped for longer than binlog_expire_logs_seconds, binlogs might be deleted before ingestion resumes.

Verify binlog configuration

After configuring the server parameters and restarting the server, verify that binary logging is properly configured:

SQL
-- Check if binary logging is enabled
SHOW VARIABLES LIKE 'log_bin';

-- Should return:
-- +---------------+-------+
-- | Variable_name | Value |
-- +---------------+-------+
-- | log_bin       | ON    |
-- +---------------+-------+

-- Check binlog format
SHOW VARIABLES LIKE 'binlog_format';

-- Should return:
-- +---------------+-------+
-- | Variable_name | Value |
-- +---------------+-------+
-- | binlog_format | ROW   |
-- +---------------+-------+

-- Check binlog row image
SHOW VARIABLES LIKE 'binlog_row_image';

-- Should return:
-- +------------------+-------+
-- | Variable_name    | Value |
-- +------------------+-------+
-- | binlog_row_image | FULL  |
-- +------------------+-------+

-- Check retention in seconds
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';

Read replica support

The MySQL connector supports ingesting from read replicas for Azure Database for MySQL. Using a read replica can reduce load on your primary database.

To use a read replica:

Create a read replica of your primary server in the Azure portal.
Verify that the read replica has binary logging enabled (inherited from the primary server).
Use the read replica's connection string when creating the Databricks connection.

note

There might be a replication lag between the primary server and the replica, which can affect data freshness. Monitor replication lag using Azure Monitor metrics.

Network configuration

Ensure that your Azure Database for MySQL server is accessible from Databricks:

Configure firewall rules to allow inbound traffic from Databricks IP ranges.
If using a private endpoint, ensure proper virtual network peering or connectivity.
Consider using Azure Private Link for secure connectivity.

To add a firewall rule:

In the Azure portal, navigate to your MySQL server.
Under Settings, select Networking.
Add a new firewall rule with the Databricks IP ranges.
Click Save.

Next steps

An admin can now either:

Use Catalog Explorer to create a connection so non-admins can create pipelines
Use the data ingestion UI to create a connection and a pipeline

Requirements​

Configure server parameters​

Required server parameters​

Azure portal​

Binlog retention considerations​

Verify binlog configuration​

Read replica support​

Network configuration​

Next steps​