get_json_object
Extracts json object from a json string based on json path specified, and returns json string of the extracted json object. It will return null if the input json string is invalid.
Syntax
Python
from pyspark.sql import functions as sf
sf.get_json_object(col, path)
Parameters
Parameter | Type | Description |
|---|---|---|
|
| String column in json format. |
| str | Path to the json object to extract. |
Returns
pyspark.sql.Column: string representation of given JSON object value.
Examples
Example 1: Extract a json object from json string
Python
from pyspark.sql import functions as sf
data = [("1", '''{"f1": "value1", "f2": "value2"}'''), ("2", '''{"f1": "value12"}''')]
df = spark.createDataFrame(data, ("key", "jstring"))
df.select(df.key,
sf.get_json_object(df.jstring, '$.f1').alias("c0"),
sf.get_json_object(df.jstring, '$.f2').alias("c1")
).show()
Output
+---+-------+------+
|key| c0| c1|
+---+-------+------+
| 1| value1|value2|
| 2|value12| NULL|
+---+-------+------+
Example 2: Extract a json object from json array
Python
from pyspark.sql import functions as sf
data = [
("1", '''[{"f1": "value1"},{"f1": "value2"}]'''),
("2", '''[{"f1": "value12"},{"f2": "value13"}]''')
]
df = spark.createDataFrame(data, ("key", "jarray"))
df.select(df.key,
sf.get_json_object(df.jarray, '$[0].f1').alias("c0"),
sf.get_json_object(df.jarray, '$[1].f2').alias("c1")
).show()
Output
+---+-------+-------+
|key| c0| c1|
+---+-------+-------+
| 1| value1| NULL|
| 2|value12|value13|
+---+-------+-------+
Python
df.select(df.key,
sf.get_json_object(df.jarray, '$[*].f1').alias("c0"),
sf.get_json_object(df.jarray, '$[*].f2').alias("c1")
).show()
Output
+---+-------------------+---------+
|key| c0| c1|
+---+-------------------+---------+
| 1|["value1","value2"]| NULL|
| 2| "value12"|"value13"|
+---+-------------------+---------+