Read and write data from Snowflake

Databricks has a Snowflake connector in the Databricks Runtime to enable reading and writing data from Snowflake.

Experimental

The legacy query federation documentation has been retired and might not be updated. The configurations mentioned in this content are not officially endorsed or tested by Databricks. If Lakehouse Federation supports your source database, Databricks recommends using that instead.

Query a Snowflake table in Databricks

You can configure a connection to Snowflake and then query data. Before you begin, check which version of Databricks Runtime your cluster runs on. The following code includes example syntax in Python, SQL, and Scala.

Python
SQL
Scala

Python

# The following example applies to Databricks Runtime 11.3 LTS and above.

snowflake_table = (spark.read
  .format("snowflake")
  .option("host", "hostname")
  .option("port", "port") # Optional - will use default port 443 if not specified.
  .option("user", "username")
  .option("password", "password")
  .option("sfWarehouse", "warehouse_name")
  .option("database", "database_name")
  .option("schema", "schema_name") # Optional - will use default schema "public" if not specified.
  .option("dbtable", "table_name")
  .load()
)

# The following example applies to Databricks Runtime 10.4 and below.

snowflake_table = (spark.read
  .format("snowflake")
  .option("dbtable", table_name)
  .option("sfUrl", database_host_url)
  .option("sfUser", username)
  .option("sfPassword", password)
  .option("sfDatabase", database_name)
  .option("sfSchema", schema_name)
  .option("sfWarehouse", warehouse_name)
  .load()
)

SQL

/* The following example applies to Databricks Runtime 11.3 LTS and above. */

DROP TABLE IF EXISTS snowflake_table;
CREATE TABLE snowflake_table
USING snowflake
OPTIONS (
    host '<hostname>',
    port '<port>', /* Optional - will use default port 443 if not specified. */
    user '<username>',
    password '<password>',
    sfWarehouse '<warehouse_name>',
    database '<database-name>',
    schema '<schema-name>', /* Optional - will use default schema "public" if not specified. */
    dbtable '<table-name>'
);
SELECT * FROM snowflake_table;

/* The following example applies to Databricks Runtime 10.4 LTS and below. */

DROP TABLE IF EXISTS snowflake_table;
CREATE TABLE snowflake_table
USING snowflake
OPTIONS (
    dbtable '<table-name>',
    sfUrl '<database-host-url>',
    sfUser '<username>',
    sfPassword '<password>',
    sfDatabase '<database-name>',
    sfSchema '<schema-name>',
    sfWarehouse '<warehouse-name>'
);
SELECT * FROM snowflake_table;

Scala

# The following example applies to Databricks Runtime 11.3 LTS and above.

val snowflake_table = spark.read
  .format("snowflake")
  .option("host", "hostname")
  .option("port", "port") /* Optional - will use default port 443 if not specified. */
  .option("user", "username")
  .option("password", "password")
  .option("sfWarehouse", "warehouse_name")
  .option("database", "database_name")
  .option("schema", "schema_name") /* Optional - will use default schema "public" if not specified. */
  .option("dbtable", "table_name")
  .load()

# The following example applies to Databricks Runtime 10.4 and below.

val snowflake_table = spark.read
  .format("snowflake")
  .option("dbtable", table_name)
  .option("sfUrl", database_host_url)
  .option("sfUser", username)
  .option("sfPassword", password)
  .option("sfDatabase", database_name)
  .option("sfSchema", schema_name)
  .option("sfWarehouse", warehouse_name)
  .load()

Write data to Snowflake

You can write a Spark DataFrame to a Snowflake table by using the same snowflake data source format with df.write. You can also issue INSERT INTO and CTAS statements against a Snowflake-backed table.

Python
SQL

Python
sf_options = {
  "host":        "<hostname>",
  "sfDatabase":  "<database-name>",
  "sfSchema":    "<schema-name>",
  "sfWarehouse": "<warehouse-name>",
  "sfRole":      "<role-name>",
  "sfUser":      "<username>",
  "sfPassword":  "<password>",
  "dbtable":     "<table-name>",
}

(df.write
   .format("snowflake")
   .options(**sf_options)
   .mode("append")
   .save())

SQL
CREATE TABLE snowflake_target
USING snowflake
OPTIONS (
  host        '<hostname>',
  sfUser      '<username>',
  sfPassword  '<password>',
  sfDatabase  '<database-name>',
  sfSchema    '<schema-name>',
  sfWarehouse '<warehouse-name>',
  sfRole      '<role-name>',
  dbtable     '<table-name>'
) AS SELECT * FROM source_view;

Writes on serverless compute

On serverless Spark and Databricks SQL, the Snowflake connector only supports the options listed in the following table.

Classic Databricks Runtime option	Serverless equivalent
`sfUrl`	`host` (plus optional `port`)
`user`, `password`	`sfUser`, `sfPassword`
`database`, `schema`, `warehouse`	`sfDatabase`, `sfSchema`, `sfWarehouse`
`preactions`, `postactions`	Not supported on serverless. Move logic into Snowflake stored procedures, tasks, or scheduled jobs.
`tempdir`	Not supported on serverless.

Classic Databricks Runtime option	Serverless equivalent
`sfUrl`	`host` (plus optional `port`)
`user`, `password`	`sfUser`, `sfPassword`
`database`, `schema`, `warehouse`	`sfDatabase`, `sfSchema`, `sfWarehouse`
`preactions`, `postactions`	Not supported on serverless. Move logic into Snowflake stored procedures, tasks, or scheduled jobs.
`tempdir`	Not supported on serverless.

Notebook example: Snowflake Connector for Spark

The following notebooks provide simple examples of how to write data to and read data from Snowflake. See Snowflake Connector for Spark for more details.

tip

Avoid exposing your Snowflake username and password in notebooks by using Secrets, which are demonstrated in the notebooks.

Snowflake Python notebook

Open notebook in new tab Open in Databricks

Notebook example: Save model training results to Snowflake

The following notebook walks through best practices for using the Snowflake Connector for Spark. It writes data to Snowflake, uses Snowflake for some basic data manipulation, trains a machine learning (ML) model in Databricks, and writes the results back to Snowflake.

Store ML training results in Snowflake notebook

Open notebook in new tab Open in Databricks

Frequently asked questions (FAQ)

Why don't my Spark DataFrame columns appear in the same order in Snowflake?

The Snowflake Connector for Spark doesn't respect the order of the columns in the table being written to. You must explicitly specify the mapping between DataFrame and Snowflake columns. To specify this mapping, use the columnmap parameter.

Why is `INTEGER` data written to Snowflake read back as `DECIMAL`?

Snowflake represents all INTEGER types as NUMBER, which can cause a change in data type when you write data to and read data from Snowflake. For example, Snowflake can convert INTEGER data to DECIMAL when writing, because INTEGER and DECIMAL are semantically equivalent in Snowflake (see Snowflake Numeric Data Types).

Why are the fields in my Snowflake table schema always uppercase?

Snowflake uses uppercase fields by default, which means that Snowflake converts the table schema to uppercase.

Query a Snowflake table in Databricks​

Write data to Snowflake​

Writes on serverless compute​

Notebook example: Snowflake Connector for Spark​

Snowflake Python notebook

Notebook example: Save model training results to Snowflake​

Store ML training results in Snowflake notebook

Frequently asked questions (FAQ)​

Why don't my Spark DataFrame columns appear in the same order in Snowflake?​

Why is INTEGER data written to Snowflake read back as DECIMAL?​

Why are the fields in my Snowflake table schema always uppercase?​