
from_json

Parses a column containing a JSON string into a StructType, an ArrayType, or a MapType with StringType keys, according to the specified schema. Returns null if the string cannot be parsed.

Syntax

Python
from pyspark.sql import functions as sf

sf.from_json(col, schema, options=None)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| col | pyspark.sql.Column or str | A column, or the name of a column, containing a JSON string. |
| schema | DataType or str | A StructType, an ArrayType of StructType, or a DDL-formatted string describing the schema to use when parsing the JSON column. |
| options | dict, optional | Options to control parsing. Accepts the same options as the JSON data source. |

Returns

pyspark.sql.Column: a new column of the specified complex type, parsed from the given JSON string.

Examples

Example 1: Parsing JSON with a specified schema

Python
import pyspark.sql.functions as sf
from pyspark.sql.types import StructType, StructField, IntegerType
schema = StructType([StructField("a", IntegerType())])
df = spark.createDataFrame([(1, '''{"a": 1}''')], ("key", "value"))
df.select(sf.from_json(df.value, schema).alias("json")).show()
Output
+----+
|json|
+----+
| {1}|
+----+

Example 2: Parsing JSON with a DDL-formatted string

Python
import pyspark.sql.functions as sf
df = spark.createDataFrame([(1, '''{"a": 1}''')], ("key", "value"))
df.select(sf.from_json(df.value, "a INT").alias("json")).show()
Output
+----+
|json|
+----+
| {1}|
+----+

Example 3: Parsing JSON into a MapType

Python
import pyspark.sql.functions as sf
df = spark.createDataFrame([(1, '''{"a": 1}''')], ("key", "value"))
df.select(sf.from_json(df.value, "MAP<STRING,INT>").alias("json")).show()
Output
+--------+
|    json|
+--------+
|{a -> 1}|
+--------+

Example 4: Parsing JSON into an ArrayType of StructType

Python
import pyspark.sql.functions as sf
from pyspark.sql.types import ArrayType, StructType, StructField, IntegerType
schema = ArrayType(StructType([StructField("a", IntegerType())]))
df = spark.createDataFrame([(1, '''[{"a": 1}]''')], ("key", "value"))
df.select(sf.from_json(df.value, schema).alias("json")).show()
Output
+-----+
| json|
+-----+
|[{1}]|
+-----+

Example 5: Parsing JSON into an ArrayType

Python
import pyspark.sql.functions as sf
from pyspark.sql.types import ArrayType, IntegerType
schema = ArrayType(IntegerType())
df = spark.createDataFrame([(1, '''[1, 2, 3]''')], ("key", "value"))
df.select(sf.from_json(df.value, schema).alias("json")).show()
Output
+---------+
|     json|
+---------+
|[1, 2, 3]|
+---------+

Example 6: Parsing JSON with specified options

Python
import pyspark.sql.functions as sf
df = spark.createDataFrame([(1, '''{a:123}'''), (2, '''{"a":456}''')], ("key", "value"))
parsed1 = sf.from_json(df.value, "a INT")
parsed2 = sf.from_json(df.value, "a INT", {"allowUnquotedFieldNames": "true"})
df.select("value", parsed1, parsed2).show()
Output
+---------+----------------+----------------+
|    value|from_json(value)|from_json(value)|
+---------+----------------+----------------+
|  {a:123}|          {NULL}|           {123}|
|{"a":456}|           {456}|           {456}|
+---------+----------------+----------------+