
to_json

Converts a column containing a StructType, ArrayType, MapType, or VariantType into a JSON string. Throws an exception for unsupported types.

Syntax

Python
from pyspark.sql import functions as sf

sf.to_json(col, options=None)

Parameters

col : pyspark.sql.Column or str
    Name of the column containing a struct, an array, a map, or a variant object.

options : dict, optional
    Options to control converting. Accepts the same options as the JSON datasource. Additionally, the function supports the pretty option, which enables pretty JSON generation.

Returns

pyspark.sql.Column: the JSON object as a string column.

Examples

Example 1: Converting a StructType column to JSON

Python
import pyspark.sql.functions as sf
from pyspark.sql import Row
data = [(1, Row(age=2, name='Alice'))]
df = spark.createDataFrame(data, ("key", "value"))
df.select(sf.to_json(df.value).alias("json")).show(truncate=False)
Output
+------------------------+
|json                    |
+------------------------+
|{"age":2,"name":"Alice"}|
+------------------------+

Example 2: Converting an ArrayType column to JSON

Python
import pyspark.sql.functions as sf
from pyspark.sql import Row
data = [(1, [Row(age=2, name='Alice'), Row(age=3, name='Bob')])]
df = spark.createDataFrame(data, ("key", "value"))
df.select(sf.to_json(df.value).alias("json")).show(truncate=False)
Output
+-------------------------------------------------+
|json                                             |
+-------------------------------------------------+
|[{"age":2,"name":"Alice"},{"age":3,"name":"Bob"}]|
+-------------------------------------------------+

Example 3: Converting a MapType column to JSON

Python
import pyspark.sql.functions as sf
df = spark.createDataFrame([(1, {"name": "Alice"})], ("key", "value"))
df.select(sf.to_json(df.value).alias("json")).show(truncate=False)
Output
+----------------+
|json            |
+----------------+
|{"name":"Alice"}|
+----------------+

Example 4: Converting a VariantType column to JSON

Python
import pyspark.sql.functions as sf
df = spark.createDataFrame([(1, '{"name": "Alice"}')], ("key", "value"))
df.select(sf.to_json(sf.parse_json(df.value)).alias("json")).show(truncate=False)
Output
+----------------+
|json            |
+----------------+
|{"name":"Alice"}|
+----------------+

Example 5: Converting a nested MapType column to JSON

Python
import pyspark.sql.functions as sf
df = spark.createDataFrame([(1, [{"name": "Alice"}, {"name": "Bob"}])], ("key", "value"))
df.select(sf.to_json(df.value).alias("json")).show(truncate=False)
Output
+---------------------------------+
|json                             |
+---------------------------------+
|[{"name":"Alice"},{"name":"Bob"}]|
+---------------------------------+

Example 6: Converting a simple ArrayType column to JSON

Python
import pyspark.sql.functions as sf
df = spark.createDataFrame([(1, ["Alice", "Bob"])], ("key", "value"))
df.select(sf.to_json(df.value).alias("json")).show(truncate=False)
Output
+---------------+
|json           |
+---------------+
|["Alice","Bob"]|
+---------------+

Example 7: Converting to JSON with specified options

Python
import pyspark.sql.functions as sf
df = spark.sql("SELECT (DATE('2022-02-22'), 1) AS date")
json1 = sf.to_json(df.date)
json2 = sf.to_json(df.date, {"dateFormat": "yyyy/MM/dd"})
df.select("date", json1, json2).show(truncate=False)
Output
+---------------+------------------------------+------------------------------+
|date           |to_json(date)                 |to_json(date)                 |
+---------------+------------------------------+------------------------------+
|{2022-02-22, 1}|{"col1":"2022-02-22","col2":1}|{"col1":"2022/02/22","col2":1}|
+---------------+------------------------------+------------------------------+