
array_position

Locates the position of the first occurrence of the given value in the given array. Returns null if either argument is null. Positions are 1-based, not 0-based. Returns 0 if the value is not found in the array.
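The rules above can be summarized in a small plain-Python model. This is only a sketch of the documented semantics, not Spark's implementation: `array_position_model` is a hypothetical helper, and Python `None` stands in for SQL null.

```python
from typing import Any, Optional, Sequence


def array_position_model(arr: Optional[Sequence[Any]], value: Any) -> Optional[int]:
    """Hypothetical reference model of the documented array_position rules."""
    # A null array or a null search value yields null.
    if arr is None or value is None:
        return None
    # Positions are counted from 1; the first match wins.
    for position, item in enumerate(arr, start=1):
        # Null elements never match the search value.
        if item is not None and item == value:
            return position
    # 0 signals "not found".
    return 0


found = array_position_model(["c", "b", "a"], "a")
missing = array_position_model(["c", "b", "a"], "d")
null_array = array_position_model(None, "a")
```

Because 0 is never a valid 1-based position, it can safely serve as the "not found" marker, while null is reserved for null inputs.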

Syntax

Python
from pyspark.sql import functions as sf

sf.array_position(col, value)

Parameters

Parameter | Type | Description
col | pyspark.sql.Column or str | Target column to work on.
value | Any | Value or a Column expression to look for.

Returns

pyspark.sql.Column: the 1-based position of the value in the given array if found, and 0 otherwise.

Examples

Example 1: Finding the position of a string in an array of strings

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([(["c", "b", "a"],)], ['data'])
df.select(sf.array_position(df.data, "a")).show()
Output
+-----------------------+
|array_position(data, a)|
+-----------------------+
|                      3|
+-----------------------+

Example 2: Finding the position of a string in an empty array

Python
from pyspark.sql import functions as sf
from pyspark.sql.types import ArrayType, StringType, StructField, StructType
schema = StructType([StructField("data", ArrayType(StringType()), True)])
df = spark.createDataFrame([([],)], schema=schema)
df.select(sf.array_position(df.data, "a")).show()
Output
+-----------------------+
|array_position(data, a)|
+-----------------------+
|                      0|
+-----------------------+

Example 3: Finding the position of an integer in an array of integers

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([([1, 2, 3],)], ['data'])
df.select(sf.array_position(df.data, 2)).show()
Output
+-----------------------+
|array_position(data, 2)|
+-----------------------+
|                      2|
+-----------------------+

Example 4: Finding the position of a non-existing value in an array

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([(["c", "b", "a"],)], ['data'])
df.select(sf.array_position(df.data, "d")).show()
Output
+-----------------------+
|array_position(data, d)|
+-----------------------+
|                      0|
+-----------------------+

Example 5: Finding the position of a value in an array with nulls

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([([None, "b", "a"],)], ['data'])
df.select(sf.array_position(df.data, "a")).show()
Output
+-----------------------+
|array_position(data, a)|
+-----------------------+
|                      3|
+-----------------------+

Example 6: Finding the position of a column's value in an array of integers

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([([10, 20, 30], 20)], ['data', 'col'])
df.select(sf.array_position(df.data, df.col)).show()
Output
+-------------------------+
|array_position(data, col)|
+-------------------------+
|                        2|
+-------------------------+