text (DataFrameReader)

Loads text files and returns a DataFrame whose schema starts with a string column named value, followed by partition columns if any are present. Text files must be encoded as UTF-8. By default, each line in the text file becomes a row in the resulting DataFrame.

Syntax

text(paths, wholetext=False, lineSep=None, **options)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| paths | str or list | One or more input paths. |
| wholetext | bool, optional | If True, read each file as a single row. Default is False. |
| lineSep | str, optional | The line separator to use. If not set, lines are split on any of '\r', '\r\n', or '\n'. |

Returns

DataFrame

Examples

Write a DataFrame into a text file and read it back.

Python
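import tempfile

# Assumes an active SparkSession named `spark`.
with tempfile.TemporaryDirectory(prefix="text") as d:
    # Write a single-column DataFrame as text files, one row per line.
    df = spark.createDataFrame([("a",), ("b",), ("c",)], schema=["alphabets"])
    df.write.mode("overwrite").format("text").save(d)

    # Read the files back, reusing the schema so the column is named
    # "alphabets" instead of the default "value".
    spark.read.schema(df.schema).text(d).sort("alphabets").show()
    # +---------+
    # |alphabets|
    # +---------+
    # |        a|
    # |        b|
    # |        c|
    # +---------+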