save

Saves the contents of the DataFrame to a data source. The data source is specified by format and a set of options. If format is not specified, the default data source configured by spark.sql.sources.default is used.
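As a minimal sketch of how the format is resolved, the function below writes the same DataFrame three ways. The function name save_three_ways and the subdirectory names are hypothetical, chosen only for illustration; df is assumed to be an existing PySpark DataFrame. The default source is Parquet unless spark.sql.sources.default has been changed.

```python
def save_three_ways(df, base_path):
    """Write df three times to show how the data source format is chosen.

    save_three_ways and the subdirectory names are illustrative, not part
    of the PySpark API; df is assumed to be a PySpark DataFrame.
    """
    # 1. Format passed directly to save() as a keyword argument.
    df.write.save(f"{base_path}/kw", format="json")

    # 2. Format set on the writer first, then save() called with only a path.
    df.write.format("json").save(f"{base_path}/chained")

    # 3. No format given: Spark uses the source configured by
    #    spark.sql.sources.default (Parquet unless reconfigured).
    df.write.save(f"{base_path}/default")
```

The first two calls should produce equivalent JSON output; which style to use is largely a matter of readability.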

Syntax

save(path=None, format=None, mode=None, partitionBy=None, **options)

Parameters

| Parameter | Type | Description |
|---|---|---|
| path | str, optional | The path in a Hadoop-supported file system. |
| format | str, optional | The format used to save. |
| mode | str, optional | The behavior when data already exists. Accepted values are 'append', 'overwrite', 'ignore', and 'error' (alias 'errorifexists'), which is the default. |
| partitionBy | list, optional | Names of partitioning columns. |
| **options | dict | Additional string options. |
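The parameters above can be combined in a single call. The following is a sketch under stated assumptions: the function name save_partitioned and the column name country are hypothetical, and df is assumed to be a PySpark DataFrame that contains that column. The compression keyword is forwarded through **options to the Parquet data source.

```python
def save_partitioned(df, path):
    """Save df as Parquet, replacing existing data and partitioning by country.

    save_partitioned and the "country" column are illustrative assumptions;
    df is expected to be a PySpark DataFrame with a "country" column.
    """
    df.write.save(
        path,
        format="parquet",          # data source to write with
        mode="overwrite",          # replace any data already at path
        partitionBy=["country"],   # one subdirectory per distinct country value
        compression="snappy",      # extra option, passed via **options
    )
```

With mode="overwrite", existing data at the path is replaced; 'append' would add to it, 'ignore' would leave it untouched without raising, and the default raises an error if data already exists.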

Returns

None

Examples

Write a DataFrame into a JSON file and read it back.

Python
import tempfile

# Assumes an active SparkSession bound to the name spark.
with tempfile.TemporaryDirectory(prefix="save") as d:
    # Write a DataFrame into a JSON file.
    spark.createDataFrame(
        [{"age": 100, "name": "Alice"}]
    ).write.mode("overwrite").format("json").save(d)

    # Read the JSON file back while the temporary directory still exists.
    spark.read.format('json').load(d).show()
    # +---+-----+
    # |age| name|
    # +---+-----+
    # |100|Alice|
    # +---+-----+