Query MySQL with Databricks

This example queries MySQL using its JDBC driver. For more details on reading, writing, configuring parallelism, and query pushdown, see Query databases using JDBC.

Note

You may prefer Lakehouse Federation for managing queries to MySQL. See Run queries using Lakehouse Federation.

Using JDBC

driver = "org.mariadb.jdbc.Driver"

database_host = "<database-host-url>"
database_port = "3306" # update if you use a non-default port
database_name = "<database-name>"
table = "<table-name>"
user = "<username>"
password = "<password>"

url = f"jdbc:mysql://{database_host}:{database_port}/{database_name}"

remote_table = (spark.read
  .format("jdbc")
  .option("driver", driver)
  .option("url", url)
  .option("dbtable", table)
  .option("user", user)
  .option("password", password)
  .load()
)
val driver = "org.mariadb.jdbc.Driver"

val database_host = "<database-host-url>"
val database_port = "3306" # update if you use a non-default port
val database_name = "<database-name>"
val table = "<table-name>"
val user = "<username>"
val password = "<password>"

val url = s"jdbc:mysql://${database_host}:${database_port}/${database_name}"

val remote_table = spark.read
  .format("jdbc")
  .option("driver", driver)
  .option("url", url)
  .option("dbtable", table)
  .option("user", user)
  .option("password", password)
  .load()
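
The linked JDBC article covers writing, parallel reads, and query pushdown in depth. As a quick illustration, the Python sketch below reuses the variables from the Python example above; the query text, the partition column id, the bounds, and the partition count are placeholder values, not values from this page.

# Push a query down to MySQL instead of reading a whole table.
# The "query" and "dbtable" options are mutually exclusive.
pushed_down = (spark.read
  .format("jdbc")
  .option("driver", driver)
  .option("url", url)
  .option("query", f"SELECT id, name FROM {table} WHERE id > 1000")  # placeholder columns and filter
  .option("user", user)
  .option("password", password)
  .load()
)

# Read the table in parallel by range-partitioning on a numeric column.
# "id", the bounds, and the partition count are placeholder values.
parallel_read = (spark.read
  .format("jdbc")
  .option("driver", driver)
  .option("url", url)
  .option("dbtable", table)
  .option("user", user)
  .option("password", password)
  .option("partitionColumn", "id")
  .option("lowerBound", "1")
  .option("upperBound", "100000")
  .option("numPartitions", "8")
  .load()
)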

Using the MySQL connector in Databricks Runtime

In Databricks Runtime 11.2 and above, you can use the named connector to query MySQL. See the following examples:

Python

remote_table = (spark.read
  .format("mysql")
  .option("dbtable", table_name)
  .option("host", database_host_url)
  .option("port", 3306)
  .option("database", database_name)
  .option("user", username)
  .option("password", password)
  .load()
)

SQL

DROP TABLE IF EXISTS mysql_table;
CREATE TABLE mysql_table
USING mysql
OPTIONS (
  dbtable '<table-name>',
  host '<database-host-url>',
  port '3306',
  database '<database-name>',
  user '<username>',
  password '<password>'
);
SELECT * FROM mysql_table;

Scala

val remote_table = spark.read
  .format("mysql")
  .option("dbtable", table_name)
  .option("host", database_host_url)
  .option("port", 3306)
  .option("database", database_name)
  .option("user", username)
  .option("password", password)
  .load()
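
The named connector examples above read MySQL data into a DataFrame. To write data back to MySQL, one option is the generic JDBC writer. A minimal Python sketch, reusing the variables from the JDBC example earlier on this page; target_table is a placeholder for an existing MySQL table:

# Append the rows of remote_table to an existing MySQL table over JDBC.
# "target_table" is a placeholder; use mode("overwrite") to replace the table instead.
(remote_table.write
  .format("jdbc")
  .option("driver", driver)
  .option("url", url)
  .option("dbtable", "target_table")
  .option("user", user)
  .option("password", password)
  .mode("append")
  .save()
)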