sort_array

Sorts the input array in ascending or descending order according to the natural ordering of the array elements. Null elements are placed at the beginning of the returned array when sorting in ascending order, and at the end when sorting in descending order.
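The null-placement rule can be illustrated with a small plain-Python model. This is only a sketch of the documented ordering, not Spark's implementation: `sort_array_model` is a hypothetical helper name that is not part of PySpark, Python `None` stands in for NULL, and only integer elements are considered.

```python
from typing import List, Optional


def sort_array_model(values: List[Optional[int]], asc: bool = True) -> List[Optional[int]]:
    """Plain-Python model of the documented ordering (hypothetical, not a PySpark API)."""
    # Separate nulls from the values that have a natural ordering.
    nulls = [v for v in values if v is None]
    non_nulls = sorted((v for v in values if v is not None), reverse=not asc)
    # Ascending: nulls come first. Descending: nulls come last.
    return nulls + non_nulls if asc else non_nulls + nulls


print(sort_array_model([2, 1, None, 3]))             # [None, 1, 2, 3]
print(sort_array_model([2, 1, None, 3], asc=False))  # [3, 2, 1, None]
```

The two printed lists mirror the ascending and descending results shown in Examples 1 and 2.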

Syntax

Python
from pyspark.sql import functions as sf

sf.sort_array(col, asc=True)

Parameters

| Parameter | Type | Description |
|---|---|---|
| col | pyspark.sql.Column or str | Name of the column or expression. |
| asc | bool, optional | Whether to sort in ascending or descending order. If asc is True (default), then the sorting is in ascending order. If False, then in descending order. |

Returns

pyspark.sql.Column: Sorted array.

Examples

Example 1: Sorting an array in ascending order

Python
import pyspark.sql.functions as sf
df = spark.createDataFrame([([2, 1, None, 3],)], ['data'])
df.select(sf.sort_array(df.data)).show()
Output
+----------------------+
|sort_array(data, true)|
+----------------------+
|       [NULL, 1, 2, 3]|
+----------------------+

Example 2: Sorting an array in descending order

Python
import pyspark.sql.functions as sf
df = spark.createDataFrame([([2, 1, None, 3],)], ['data'])
df.select(sf.sort_array(df.data, asc=False)).show()
Output
+-----------------------+
|sort_array(data, false)|
+-----------------------+
|        [3, 2, 1, NULL]|
+-----------------------+

Example 3: Sorting an array with a single element

Python
import pyspark.sql.functions as sf
df = spark.createDataFrame([([1],)], ['data'])
df.select(sf.sort_array(df.data)).show()
Output
+----------------------+
|sort_array(data, true)|
+----------------------+
|                   [1]|
+----------------------+

Example 4: Sorting an empty array

Python
from pyspark.sql import functions as sf
from pyspark.sql.types import ArrayType, StringType, StructField, StructType
schema = StructType([StructField("data", ArrayType(StringType()), True)])
df = spark.createDataFrame([([],)], schema=schema)
df.select(sf.sort_array(df.data)).show()
Output
+----------------------+
|sort_array(data, true)|
+----------------------+
|                    []|
+----------------------+

Example 5: Sorting an array that contains only null values

Python
from pyspark.sql import functions as sf
from pyspark.sql.types import ArrayType, IntegerType, StructType, StructField
schema = StructType([StructField("data", ArrayType(IntegerType()), True)])
df = spark.createDataFrame([([None, None, None],)], schema=schema)
df.select(sf.sort_array(df.data)).show()
Output
+----------------------+
|sort_array(data, true)|
+----------------------+
|    [NULL, NULL, NULL]|
+----------------------+