saveAsParquetFile

saveAsParquetFile saves the contents of a SparkDataFrame as a Parquet file, preserving the schema.

Syntax:

  • saveAsParquetFile(df, "path")

Parameters:

  • df: Any SparkDataFrame
  • path: String, file path to save to

Output:

  • Parquet file

library(SparkR)

# Create SparkDataFrame using the faithful dataset from R
df <- createDataFrame(faithful)

# Display the first rows of the SparkDataFrame
head(df)

# Save df as a Parquet file
saveAsParquetFile(df, "/tmp/temp.parquet")

%fs ls /tmp/temp.parquet

# The Parquet file can be read back with parquetFile() or read.df()
parquetDF <- parquetFile("/tmp/temp.parquet")
head(parquetDF)

# Print the schema of parquetDF
printSchema(parquetDF)
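
In SparkR 2.0 and later, saveAsParquetFile() and parquetFile() are deprecated in favor of write.parquet() and read.parquet(), and the SparkR migration guide lists the old names as removed in 3.0. The following is a minimal sketch of the same round trip with the newer names, assuming SparkR 2.0 or later and a Spark session that is available or can be started. The output path /tmp/temp_v2.parquet is a hypothetical name chosen for illustration.

```r
library(SparkR)

# Start (or reuse) a Spark session; on Databricks one is usually already active.
sparkR.session()

# Create a SparkDataFrame from R's built-in faithful dataset
df <- createDataFrame(faithful)

# Hypothetical output path, chosen so it does not collide with /tmp/temp.parquet
path <- "/tmp/temp_v2.parquet"

# write.parquet() replaces saveAsParquetFile().
# By default it raises an error if the path already exists.
write.parquet(df, path)

# read.parquet() replaces parquetFile()
parquetDF <- read.parquet(path)
printSchema(parquetDF)

# read.df() with source = "parquet" loads the same data through the generic reader
parquetDF2 <- read.df(path, source = "parquet")
head(parquetDF2)
```

Because the default write mode errors on an existing path, remove /tmp/temp_v2.parquet before rerunning the sketch.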

Remove temp.parquet:

%fs rm -r /tmp/temp.parquet