slice

Returns a new array column by slicing each array in the input column, taking up to length elements starting at index start. Array indices start at 1 and can be negative to index from the end of the array. The length specifies the number of elements in the resulting array.

Syntax

Python
from pyspark.sql import functions as sf

sf.slice(x, start, length)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| x | pyspark.sql.Column or str | Input array column or column name to be sliced. |
| start | pyspark.sql.Column, str, or int | The start index for the slice operation. If negative, starts the index from the end of the array. |
| length | pyspark.sql.Column, str, or int | The length of the slice, representing the number of elements in the resulting array. |

Returns

pyspark.sql.Column: A new Column object of Array type, where each value is a slice of the corresponding list from the input column.

Examples

Example 1: Basic usage of the slice function.

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([([1, 2, 3],), ([4, 5],)], ['x'])
df.select(sf.slice(df.x, 2, 2)).show()
Output
+--------------+
|slice(x, 2, 2)|
+--------------+
|        [2, 3]|
|           [5]|
+--------------+

Example 2: Slicing with negative start index.

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([([1, 2, 3],), ([4, 5],)], ['x'])
df.select(sf.slice(df.x, -1, 1)).show()
Output
+---------------+
|slice(x, -1, 1)|
+---------------+
|            [3]|
|            [5]|
+---------------+

Example 3: Slice function with column inputs for start and length.

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([([1, 2, 3], 2, 2), ([4, 5], 1, 3)], ['x', 'start', 'length'])
df.select(sf.slice(df.x, df.start, df.length)).show()
Output
+-----------------------+
|slice(x, start, length)|
+-----------------------+
|                 [2, 3]|
|                 [4, 5]|
+-----------------------+
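As a cross-check on the 1-based indexing rules used in the examples above, the per-array semantics can be modeled in plain Python. This is only a sketch of the behavior, not Spark's implementation; the helper name slice_1_based is made up for illustration.

```python
def slice_1_based(xs, start, length):
    """Model Spark's slice semantics on a plain Python list (illustrative only).

    start is 1-based; a negative start counts from the end of the list.
    Out-of-range slices return fewer (possibly zero) elements rather
    than raising, matching what the examples above show.
    """
    if start == 0:
        # Spark's slice rejects a start index of 0, since SQL arrays are 1-based.
        raise ValueError("start must not be 0")
    # Convert the 1-based (or negative) start into a 0-based Python index.
    i = start - 1 if start > 0 else len(xs) + start
    if i < 0:
        return []
    return xs[i:i + length]

# Mirrors Example 1: slice(x, 2, 2) on [1, 2, 3] and [4, 5]
print(slice_1_based([1, 2, 3], 2, 2))  # [2, 3]
print(slice_1_based([4, 5], 2, 2))     # [5]

# Mirrors Example 2: slice(x, -1, 1)
print(slice_1_based([1, 2, 3], -1, 1))  # [3]

# Mirrors Example 3: per-row start and length
print(slice_1_based([4, 5], 1, 3))  # [4, 5]
```

Note how a length larger than the remaining elements (as in the last call) simply returns everything from start to the end of the array.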