
Read and write text files

The text format reads each line of a text file as a row in a DataFrame with a single column named value of type StringType. Databricks users commonly use it for log parsing, for ingesting raw data before further processing, or for any workflow that requires line-by-line access to file content. Databricks supports reading and writing text files with Apache Spark, including compressing files on write.

Prerequisites

Databricks does not require additional configuration to use text files. However, to stream text files, you need Auto Loader.

Options

Use the .option() and .options() methods of DataFrameReader and DataFrameWriter to configure text data sources. For a complete list of supported options, see DataFrameReader text options and DataFrameWriter text options.

Usage

The following examples use the Wanderbricks dataset to demonstrate reading and writing text files using the Spark DataFrame API and SQL.

Read text files using SQL

To query text files without registering a table, use read_files. Unity Catalog permissions on the external location apply automatically.

SQL
SELECT * FROM read_files(
'/Volumes/<catalog>/<schema>/<volume>/review_comments',
format => 'text'
)

Read and write text files using Python

To write with the text format, the DataFrame must contain a single StringType column. The following example selects the comment column from the Wanderbricks reviews table, aliases it as value, writes it as text files, then reads the files back.

Python
from pyspark.sql.functions import col

# Write Wanderbricks review comments as text files
df = spark.read.table("samples.wanderbricks.reviews").select(col("comment").alias("value"))
df.write.format("text").save("/Volumes/<catalog>/<schema>/<volume>/review_comments")

# Read the text files back: each line becomes a row in the "value" column
df = spark.read.format("text").load("/Volumes/<catalog>/<schema>/<volume>/review_comments")
display(df)

Additional resources

  • Read and write CSV files: If your text data is delimited or tabular, CSV provides structured parsing with schema inference, header support, and configurable delimiters.