Snowflake is a cloud-based SQL data warehouse that focuses on great performance, zero-tuning, diversity of data sources, and security. This article explains how to read data from and write data to Snowflake using the Databricks Snowflake connector.
Databricks and Snowflake have partnered to provide a first-class connector for customers of both products. You don't have to import and load libraries into your clusters, which prevents version conflicts and misconfiguration.
The Databricks Snowflake connector is available in Databricks Runtime 4.2 and above.
The following notebooks provide simple examples of how to write data to and read data from Snowflake. See Using the Connector in the Snowflake documentation for more details.
Avoid exposing your Snowflake username and password in notebooks by using the Secrets feature, which is demonstrated in the sample notebooks below.
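As a rough sketch of this pattern, the Python snippet below assembles the connector options with credentials taken from a secret scope, then writes and reads a table. The helper names (`snowflake_options`, `write_table`, `read_table`) and the secret key names are assumptions made for illustration. Because `dbutils` and `spark` exist only inside a Databricks notebook, the secret lookup is passed in as a function.

```python
def snowflake_options(get_secret, scope, url, database, schema, warehouse):
    """Build connector options; credentials come from a secret scope, not literals."""
    return {
        "sfUrl": url,
        "sfUser": get_secret(scope, "snowflake-user"),          # hypothetical key name
        "sfPassword": get_secret(scope, "snowflake-password"),  # hypothetical key name
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }


def write_table(df, options, table, mode="append"):
    """Write a Spark DataFrame to a Snowflake table."""
    (df.write.format("snowflake")
       .options(**options)
       .option("dbtable", table)
       .mode(mode)
       .save())


def read_table(spark, options, table):
    """Read a Snowflake table into a Spark DataFrame."""
    return (spark.read.format("snowflake")
                 .options(**options)
                 .option("dbtable", table)
                 .load())
```

In a notebook, this could be called as `options = snowflake_options(dbutils.secrets.get, "<scope>", "<account-url>", "<database>", "<schema>", "<warehouse>")`, followed by `write_table(df, options, "<table>")`. The angle-bracket values are placeholders, not real names.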
Why don’t my Spark DataFrame columns appear in the same order in Snowflake?
The Spark - Snowflake connector doesn’t respect the order of the columns in the table being written to; you must explicitly specify the mapping between DataFrame and Snowflake columns. To specify this mapping, use the columnmap parameter.
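To illustrate, the sketch below builds a `columnmap` value from a Python dictionary. It assumes the string form `Map(source -> destination, ...)` described in the Snowflake connector documentation. `build_columnmap` is a hypothetical helper and the column names are made up.

```python
def build_columnmap(mapping):
    """Render {dataframe_column: snowflake_column} as a columnmap string literal."""
    if not mapping:
        raise ValueError("mapping must contain at least one column pair")
    pairs = ", ".join(f"{src} -> {dst}" for src, dst in mapping.items())
    return f"Map({pairs})"


# Hypothetical DataFrame columns "one" and "two" mapped to Snowflake columns.
columnmap = build_columnmap({"one": "ONE", "two": "TWO"})
print(columnmap)  # Map(one -> ONE, two -> TWO)
```

The resulting string would then be passed on the write path, for example `df.write.format("snowflake").options(**options).option("dbtable", "<table>").option("columnmap", columnmap).mode("append").save()`, where `options` holds your connection settings.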
Why is INTEGER data written to Snowflake always read back as DECIMAL?
Snowflake represents all INTEGER types as NUMBER, which can cause a change in data type when you write data to and read data from Snowflake. For example, INTEGER data can be converted to DECIMAL when writing to Snowflake, because INTEGER and DECIMAL are semantically equivalent in Snowflake (see Snowflake Numeric Data Types).
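If downstream code needs integer columns again, one possible workaround is to cast them after reading. The sketch below only generates Spark SQL cast expressions. `cast_exprs` is a hypothetical helper, and it assumes you know which columns hold whole numbers small enough for the target type.

```python
def cast_exprs(columns, integer_columns, target_type="BIGINT"):
    """Return selectExpr strings that cast the chosen columns and keep the rest."""
    wanted = set(integer_columns)
    missing = wanted - set(columns)
    if missing:
        raise ValueError(f"unknown columns: {sorted(missing)}")
    return [
        f"CAST(`{c}` AS {target_type}) AS `{c}`" if c in wanted else f"`{c}`"
        for c in columns
    ]
```

With a DataFrame `df` read from Snowflake, this might be applied as `df.selectExpr(*cast_exprs(df.columns, ["ID"]))`, where `ID` is an example column name.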
Why are the fields in my Snowflake table schema always uppercase?
Snowflake uses uppercase fields by default, which means that the table schema is converted to uppercase.
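If the rest of a pipeline expects lowercase names, the columns can be renamed after reading. This is a minimal sketch: `lowercase_names` is a hypothetical helper, and it refuses to rename when two names would collide after lowercasing.

```python
def lowercase_names(columns):
    """Lowercase column names, failing loudly if two names would become identical."""
    lowered = [c.lower() for c in columns]
    if len(set(lowered)) != len(lowered):
        raise ValueError("column names collide after lowercasing")
    return lowered
```

With a DataFrame `df` read from Snowflake, the rename could be applied as `df.toDF(*lowercase_names(df.columns))`.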