xml (DataFrameReader)

Loads an XML file and returns the result as a DataFrame. If schema is not specified, this function reads the input once to determine the input schema.

Syntax

xml(path, schema=None, **options)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| path | str, list, or RDD | One or more input paths, or an RDD of strings storing XML rows. |
| schema | StructType or str, optional | An optional input schema as a StructType object or a DDL-formatted string (for example, 'col0 INT, col1 DOUBLE'). |

Returns

DataFrame

Examples

Write a DataFrame into an XML file and read it back.

Python
import tempfile

# Assumes `spark` is an existing SparkSession.
with tempfile.TemporaryDirectory(prefix="xml") as d:
    # Write a DataFrame to XML, one <person> element per row.
    spark.createDataFrame(
        [{"age": 100, "name": "Alice"}]
    ).write.mode("overwrite").option("rowTag", "person").format("xml").save(d)

    # Read the XML back as a DataFrame.
    spark.read.option("rowTag", "person").xml(d).show()
    # +---+-----+
    # |age| name|
    # +---+-----+
    # |100|Alice|
    # +---+-----+
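
Read XML with an explicit schema. Because supplying `schema` means the reader does not need a pass over the input to determine it, a DDL-formatted string can be passed directly. The following is a minimal sketch, not a definitive implementation: the helper names `write_sample_xml` and `read_people`, the file name `people.xml`, and the sample rows are hypothetical and chosen only for illustration, and `spark` is assumed to be an existing SparkSession.

```python
import os
import tempfile

# DDL-formatted schema string (assumed column names for this illustration).
PERSON_SCHEMA = "age INT, name STRING"


def write_sample_xml(directory):
    """Write a small XML file with two <person> rows and return its path."""
    path = os.path.join(directory, "people.xml")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(
            "<people>\n"
            "  <person><age>100</age><name>Alice</name></person>\n"
            "  <person><age>42</age><name>Bob</name></person>\n"
            "</people>\n"
        )
    return path


def read_people(spark, path):
    """Read <person> rows using an explicit schema instead of inferring one.

    `spark` is an existing SparkSession; `path` is a str or a list of str.
    """
    return spark.read.option("rowTag", "person").xml(path, schema=PERSON_SCHEMA)


# Usage, assuming `spark` is an existing SparkSession:
# with tempfile.TemporaryDirectory(prefix="xml") as d:
#     read_people(spark, write_sample_xml(d)).show()
```

The same call accepts a StructType object in place of the DDL string, and `path` can be a list when several files share the same row tag.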