csv (DataFrameReader)

Loads a CSV file and returns the result as a DataFrame. If inferSchema is enabled, this function makes one pass over the input to determine the schema before loading the data. To avoid that pass, either disable inferSchema or specify the schema explicitly with the schema parameter.

Syntax

csv(path, schema=None, **options)

Parameters

| Parameter | Type | Description |
|---|---|---|
| path | str or list | One or more input paths, or an RDD of strings storing CSV rows. |
| schema | StructType or str, optional | An optional input schema as a StructType object or a DDL-formatted string (for example, 'col0 INT, col1 DOUBLE'). |

Returns

DataFrame

Examples

Write a DataFrame into a CSV file and read it back.

Python
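import tempfile

# Assumes an active SparkSession bound to `spark`.
with tempfile.TemporaryDirectory(prefix="csv") as d:
    # Write a single-row DataFrame as CSV under the temporary directory.
    df = spark.createDataFrame([{"age": 100, "name": "Alice"}])
    df.write.mode("overwrite").format("csv").save(d)

    # Read it back with the original schema, treating the string "Alice" as null.
    spark.read.csv(d, schema=df.schema, nullValue="Alice").show()
    # +---+----+
    # |age|name|
    # +---+----+
    # |100|NULL|
    # +---+----+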