cache

Persists the DataFrame with the default storage level (MEMORY_AND_DISK_DESER).

Syntax

cache()

Returns

DataFrame: Cached DataFrame.

Notes

As of Spark 3.0, the default storage level is MEMORY_AND_DISK_DESER, which matches the Scala API.

Cached data is shared across all Spark sessions on the cluster.

Examples

Python
df = spark.range(1)
df.cache()
# DataFrame[id: bigint]

df.explain()
# == Physical Plan ==
# InMemoryTableScan ...