explode
Returns a new row for each element in the given array or map. Unless an alias is specified, the output column is named col for array elements, and the output columns are named key and value for map entries. Rows whose array or map is NULL or empty produce no output rows.
Note
Only one explode is allowed per SELECT clause. To explode more than one column, chain separate select calls, as in Example 3.
Syntax
Python
from pyspark.sql import functions as sf
sf.explode(col)
Parameters
| Parameter | Type | Description |
|---|---|---|
| col | pyspark.sql.Column or column name | Target column to work on. |
Returns
pyspark.sql.Column: one row per array element or map key-value pair.
Examples
Example 1: Exploding an array column
Python
from pyspark.sql import functions as sf
df = spark.sql('SELECT * FROM VALUES (1,ARRAY(1,2,3,NULL)), (2,ARRAY()), (3,NULL) AS t(i,a)')
df.show()
Output
+---+---------------+
|  i|              a|
+---+---------------+
|  1|[1, 2, 3, NULL]|
|  2|             []|
|  3|           NULL|
+---+---------------+
Python
df.select('*', sf.explode('a')).show()
Output
+---+---------------+----+
|  i|              a| col|
+---+---------------+----+
|  1|[1, 2, 3, NULL]|   1|
|  1|[1, 2, 3, NULL]|   2|
|  1|[1, 2, 3, NULL]|   3|
|  1|[1, 2, 3, NULL]|NULL|
+---+---------------+----+
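Rows i=2 (empty array) and i=3 (NULL array) are absent from the output above, while the NULL element inside the first array is kept as its own row. A minimal pure-Python sketch of that row-level behavior, with no Spark dependency, might look like the following. The helper `explode_array_rows` is hypothetical and written only for illustration; it is not part of PySpark.

```python
def explode_array_rows(rows, array_field, out_field="col"):
    """Model explode() on an array column using plain dicts (illustrative only)."""
    out = []
    for row in rows:
        values = row[array_field]
        # A None (NULL) or empty array contributes no output rows.
        if not values:
            continue
        for value in values:
            # None elements inside a non-empty array are kept as rows.
            out.append({**row, out_field: value})
    return out


rows = [
    {"i": 1, "a": [1, 2, 3, None]},
    {"i": 2, "a": []},
    {"i": 3, "a": None},
]
exploded = explode_array_rows(rows, "a")
print([(r["i"], r["col"]) for r in exploded])
# [(1, 1), (1, 2), (1, 3), (1, None)]
```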
Example 2: Exploding a map column
Python
from pyspark.sql import functions as sf
df = spark.sql('SELECT * FROM VALUES (1,MAP(1,2,3,4,5,NULL)), (2,MAP()), (3,NULL) AS t(i,m)')
df.show(truncate=False)
Output
+---+---------------------------+
|i  |m                          |
+---+---------------------------+
|1  |{1 -> 2, 3 -> 4, 5 -> NULL}|
|2  |{}                         |
|3  |NULL                       |
+---+---------------------------+
Python
df.select('*', sf.explode('m')).show(truncate=False)
Output
+---+---------------------------+---+-----+
|i  |m                          |key|value|
+---+---------------------------+---+-----+
|1  |{1 -> 2, 3 -> 4, 5 -> NULL}|1  |2    |
|1  |{1 -> 2, 3 -> 4, 5 -> NULL}|3  |4    |
|1  |{1 -> 2, 3 -> 4, 5 -> NULL}|5  |NULL |
+---+---------------------------+---+-----+
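For a map column, each entry becomes one row with two new columns, and an entry whose value is NULL is still emitted. The sketch below models this with plain Python dicts, under the assumption that a map behaves like a dict for this purpose. The helper `explode_map_rows` and its `key_name` and `value_name` parameters are hypothetical; they stand in for the default key and value column names, which in PySpark can be overridden by passing two names to `alias` on the exploded column.

```python
def explode_map_rows(rows, map_field, key_name="key", value_name="value"):
    """Model explode() on a map column using plain dicts (illustrative only)."""
    out = []
    for row in rows:
        mapping = row[map_field]
        # A None (NULL) or empty map contributes no output rows.
        if not mapping:
            continue
        for key, value in mapping.items():
            # Entries with a None value are kept.
            out.append({**row, key_name: key, value_name: value})
    return out


rows = [
    {"i": 1, "m": {1: 2, 3: 4, 5: None}},
    {"i": 2, "m": {}},
    {"i": 3, "m": None},
]
pairs = [(r["i"], r["key"], r["value"]) for r in explode_map_rows(rows, "m")]
print(pairs)
# [(1, 1, 2), (1, 3, 4), (1, 5, None)]
```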
Example 3: Exploding multiple array columns
Python
import pyspark.sql.functions as sf
df = spark.sql('SELECT ARRAY(1,2) AS a1, ARRAY(3,4,5) AS a2')
df.select(
    '*', sf.explode('a1').alias('v1')
).select('*', sf.explode('a2').alias('v2')).show()
Output
+------+---------+---+---+
|    a1|       a2| v1| v2|
+------+---------+---+---+
|[1, 2]|[3, 4, 5]|  1|  3|
|[1, 2]|[3, 4, 5]|  1|  4|
|[1, 2]|[3, 4, 5]|  1|  5|
|[1, 2]|[3, 4, 5]|  2|  3|
|[1, 2]|[3, 4, 5]|  2|  4|
|[1, 2]|[3, 4, 5]|  2|  5|
+------+---------+---+---+
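Chaining two explodes pairs every element of a1 with every element of a2, so the row count is the product of the two array lengths (2 × 3 = 6 here). As a rough illustration in plain Python, assuming the same two arrays as above, the v1 and v2 columns correspond to a Cartesian product. The ordering shown matches the output above, but Spark does not guarantee row order in general.

```python
from itertools import product

a1 = [1, 2]
a2 = [3, 4, 5]

# Exploding a1 first and then a2 pairs every v1 with every v2,
# in the same order as nested loops (a1 outer, a2 inner).
pairs = list(product(a1, a2))
print(pairs)
# [(1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5)]
print(len(pairs) == len(a1) * len(a2))
# True
```

Because the result grows multiplicatively, chaining explodes over large arrays can produce far more rows than the input.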
Example 4: Exploding an array of struct column
Python
import pyspark.sql.functions as sf
df = spark.sql('SELECT ARRAY(NAMED_STRUCT("a",1,"b",2), NAMED_STRUCT("a",3,"b",4)) AS a')
df.select(sf.explode('a').alias("s")).select("s.*").show()
Output
+---+---+
|  a|  b|
+---+---+
|  1|  2|
|  3|  4|
+---+---+