Read and write data from Snowflake
Databricks provides a Snowflake connector in the Databricks Runtime to support reading data from and writing data to Snowflake.
Note
You may prefer Lakehouse Federation for managing queries on Snowflake data. See Run queries using Lakehouse Federation.
Query a Snowflake table in Databricks
You can configure a connection to Snowflake and then query data. The following code provides example syntax in Python, SQL, and Scala:
snowflake_table = (spark.read
  .format("snowflake")
  .option("dbtable", table_name)
  .option("sfUrl", database_host_url)
  .option("sfUser", username)
  .option("sfPassword", password)
  .option("sfDatabase", database_name)
  .option("sfSchema", schema_name)
  .option("sfWarehouse", warehouse_name)
  .load()
)
DROP TABLE IF EXISTS snowflake_table;
CREATE TABLE snowflake_table
USING snowflake
OPTIONS (
  dbtable '<table-name>',
  sfUrl '<database-host-url>',
  sfUser '<username>',
  sfPassword '<password>',
  sfDatabase '<database-name>',
  sfSchema '<schema-name>',
  sfWarehouse '<warehouse-name>'
);
SELECT * FROM snowflake_table;
val snowflake_table = spark.read
  .format("snowflake")
  .option("dbtable", table_name)
  .option("sfUrl", database_host_url)
  .option("sfUser", username)
  .option("sfPassword", password)
  .option("sfDatabase", database_name)
  .option("sfSchema", schema_name)
  .option("sfWarehouse", warehouse_name)
  .load()
Notebook example: Snowflake Connector for Spark
The following notebooks provide simple examples of how to write data to and read data from Snowflake. See Using the Spark Connector for more details. In particular, see Setting Configuration Options for the Connector for all configuration options.
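Because every read and write takes the same connection options, a common pattern is to collect them in a dictionary and pass them with options(). The following is a minimal sketch, assuming the connection variables (database_host_url, username, and so on) are already defined as in the examples above:

# Shared connector options, reusable across reads and writes.
sf_options = {
  "sfUrl": database_host_url,
  "sfUser": username,
  "sfPassword": password,
  "sfDatabase": database_name,
  "sfSchema": schema_name,
  "sfWarehouse": warehouse_name,
}

# Same query as above, with the shared options unpacked into the reader.
snowflake_table = (spark.read
  .format("snowflake")
  .options(**sf_options)
  .option("dbtable", table_name)
  .load()
)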
Tip
Avoid exposing your Snowflake username and password in notebooks by using Secrets, which are demonstrated in the notebooks.
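For example, you can populate the username and password variables from a secret scope rather than hardcoding them. In this sketch, the scope name "snowflake" and the key names are hypothetical placeholders; substitute the names you created:

# Retrieve credentials from a Databricks secret scope instead of hardcoding them.
# Scope and key names below are hypothetical.
username = dbutils.secrets.get(scope="snowflake", key="username")
password = dbutils.secrets.get(scope="snowflake", key="password")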
Notebook example: Save model training results to Snowflake
The following notebook walks through best practices for using the Snowflake Connector for Spark. It writes data to Snowflake, uses Snowflake for some basic data manipulation, trains a machine learning model in Databricks, and writes the results back to Snowflake.
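The final step of that workflow, writing results back to Snowflake, mirrors the read syntax. A minimal sketch, assuming a DataFrame of model output named predictions_df and reusing the sf_options dictionary from earlier; the DataFrame and table names are illustrative:

# Write model results back to Snowflake; "overwrite" replaces the target table.
(predictions_df.write
  .format("snowflake")
  .options(**sf_options)
  .option("dbtable", "model_results")  # hypothetical target table
  .mode("overwrite")
  .save()
)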
Frequently asked questions (FAQ)
Why don’t my Spark DataFrame columns appear in the same order in Snowflake?
The Snowflake Connector for Spark doesn’t respect the order of the columns in the table being written to; you must explicitly specify the mapping between DataFrame and Snowflake columns. To specify this mapping, use the columnmap parameter.
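For example, here is a sketch of a write that maps DataFrame columns one and two to Snowflake columns col_a and col_b. The DataFrame, column, and table names are illustrative; note that the columnmap value is a single string in Scala Map syntax:

# Explicitly map DataFrame columns to Snowflake columns with columnmap.
(df.write
  .format("snowflake")
  .options(**sf_options)
  .option("dbtable", "target_table")  # hypothetical table
  .option("columnmap", "Map(one -> col_a, two -> col_b)")
  .mode("append")
  .save()
)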
Why is INTEGER data written to Snowflake read back as DECIMAL?
Snowflake represents all INTEGER types as NUMBER, which can cause a change in data type when you write data to and read data from Snowflake. For example, INTEGER data can be converted to DECIMAL when writing to Snowflake, because INTEGER and DECIMAL are semantically equivalent in Snowflake (see Snowflake Numeric Data Types).
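If you need the original type after reading the data back, one workaround is to cast the affected column explicitly. A minimal sketch, where the column name id is illustrative:

from pyspark.sql.functions import col

# Cast the DECIMAL column back to an integer type after reading from Snowflake.
snowflake_table = snowflake_table.withColumn("id", col("id").cast("integer"))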