Spark Configuration

Spark configuration properties

To fine-tune Spark jobs, you can provide custom Spark configuration properties at the bottom of the cluster configuration page:

[Screenshot: Spark Config field on the cluster configuration page]
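Properties are entered one per line as a space-separated key and value. For example (the properties and values below are illustrative):

spark.sql.sources.partitionOverwriteMode DYNAMIC
spark.sql.shuffle.partitions 200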

When you configure a cluster using the Clusters API, set Spark properties in the spark_conf field in the Create cluster request or Edit cluster request.
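For example, a Create cluster request that sets a Spark property might look like the following sketch; the cluster name, Spark version, and node type are illustrative, so substitute values that match your workspace:

# Create a cluster with a custom Spark property (values are illustrative)
curl -n -X POST https://<databricks-instance>/api/2.0/clusters/create -d '{
  "cluster_name": "single-property-demo",
  "spark_version": "7.3.x-scala2.12",
  "node_type_id": "i3.xlarge",
  "num_workers": 2,
  "spark_conf": {
    "spark.sql.sources.partitionOverwriteMode": "DYNAMIC"
  }
}'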

To set Spark properties for all clusters, create a global init script:

%scala
dbutils.fs.put("dbfs:/databricks/init/set_spark_params.sh","""
#!/bin/bash
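# Append the custom property to the cluster-wide Spark defaults so it is applied when Spark starts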
sudo echo "spark.sql.sources.partitionOverwriteMode DYNAMIC" >> /databricks/spark/conf/spark-defaults.conf
""", true)

Environment variables

You can set environment variables that are accessible from scripts running on a cluster. Set environment variables in the spark_env_vars field in the Create cluster request or Edit cluster request.

[Screenshot: Environment Variables field on the cluster configuration page]
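As a minimal sketch, spark_env_vars takes a map of variable names to values; the name MY_ENV_VAR and its value below are hypothetical:

"spark_env_vars": {
  "MY_ENV_VAR": "some-value"
}

Scripts running on the cluster, such as init scripts, can then read the variable as $MY_ENV_VAR.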

Note