Skip to main content

Configure Google Cloud SQL for MySQL for ingestion

Preview

The MySQL connector is in Public Preview. Contact your Databricks account team to request access.

This article describes how to configure Google Cloud SQL for MySQL for ingestion. You must enable binary logging and configure binlog retention to support change data capture.

Requirements

  • Cloud SQL for MySQL version 5.7.44 and later, or 8.0 and later.
  • A Cloud SQL instance.

Configure database flags

Binary logging is controlled through database flags in Cloud SQL for MySQL. You must configure these flags through the Google Cloud Console.

Required database flags

Configure the following database flags:

Flag

Value

Description

log_bin

on

Enables binary logging.

binlog_format

ROW

Sets the binary log format to row-based replication. Required for change data capture.

binlog_row_image

FULL

Logs all columns in the binary log, not just changed columns.

binlog_expire_logs_seconds

604800 (7 days)

Specifies how long binary logs are kept before automatic purging. For MySQL 5.7, use expire_logs_days instead.

Google Cloud Console

To configure database flags using the Google Cloud Console:

  1. Open the Google Cloud Console and navigate to the Cloud SQL Instances page.
  2. Select your Cloud SQL instance to open the instance details page.
  3. Click Edit at the top of the page.
  4. Scroll down to the Flags section.
  5. Click Add flag and select binlog_format. Set the value to ROW.
  6. Click Add flag and select binlog_row_image. Set the value to FULL.
  7. For MySQL 8.0, click Add flag and select binlog_expire_logs_seconds. Set the value to 604800 or higher.
  8. For MySQL 5.7.44 and later, click Add flag and select expire_logs_days. Set the value to 7 or higher.
  9. Click Save at the bottom of the page.
  10. When prompted, confirm that you want to restart the instance for the changes to take effect.

Binlog retention considerations

Databricks recommends a binlog retention period of 7 days (168 hours or 604800 seconds). Setting a lower value might cause binlogs to be cleaned up before the ingestion gateway replays them, which would require a full refresh of all tables.

For MySQL 5.7.44 and later, use the expire_logs_days flag (integer value in days).

For MySQL 8.0 and later, use the binlog_expire_logs_seconds flag (integer value in seconds).

Verify the binlog configuration

After configuring the database flags and restarting the instance, verify that binary logging is properly configured. Connect to your Cloud SQL instance using a MySQL client and run:

SQL
-- Check if binary logging is enabled
SHOW VARIABLES LIKE 'log_bin';

-- Should return:
-- +---------------+-------+
-- | Variable_name | Value |
-- +---------------+-------+
-- | log_bin | ON |
-- +---------------+-------+

-- Check binlog format
SHOW VARIABLES LIKE 'binlog_format';

-- Should return:
-- +---------------+-------+
-- | Variable_name | Value |
-- +---------------+-------+
-- | binlog_format | ROW |
-- +---------------+-------+

-- Check binlog row image
SHOW VARIABLES LIKE 'binlog_row_image';

-- Should return:
-- +------------------+-------+
-- | Variable_name | Value |
-- +------------------+-------+
-- | binlog_row_image | FULL |
-- +------------------+-------+

-- For MySQL 8.0: Check retention in seconds
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';

-- For MySQL 5.7: Check retention in days
SHOW VARIABLES LIKE 'expire_logs_days';

Network configuration

Ensure that your Cloud SQL instance is accessible from Databricks:

  • Public IP: Add authorized networks to allow connections from Databricks IP ranges.
  • Private IP: Configure VPC peering between your Google Cloud VPC and Databricks.
  • Cloud SQL Proxy: Consider using Cloud SQL Auth Proxy for secure connections.

Add authorized networks (public IP)

To add authorized networks for public IP connectivity:

  1. In the Cloud Console, navigate to your Cloud SQL instance.
  2. Select Connections from the left menu.
  3. Under Authorized networks, click Add network.
  4. Enter the Databricks IP ranges.
  5. Click Done and then Save.

Automated backups

Cloud SQL automatically enables binary logging when automated backups are enabled. Verify that automated backups are enabled:

  1. In the Cloud Console, navigate to your Cloud SQL instance.
  2. Select Backups from the left menu.
  3. Verify that Automated backups is enabled.
  4. If disabled, click Edit configuration and enable automated backups.

Next steps

An admin can now either: