array_repeat

Creates an array containing a column repeated count times.

Syntax

Python
from pyspark.sql import functions as sf

sf.array_repeat(col, count)

Parameters

| Parameter | Type | Description |
|---|---|---|
| col | pyspark.sql.Column or str | The name of the column or an expression that represents the element to be repeated. |
| count | pyspark.sql.Column, str, or int | The name of the column, an expression, or an integer that represents the number of times to repeat the element. |

Returns

pyspark.sql.Column: A new column that contains an array of repeated elements.
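
As a rough mental model, the per-row result can be pictured as plain Python list repetition. The sketch below is illustrative only: `array_repeat_model` is a hypothetical helper, not part of PySpark, and its handling of a null or negative count is an assumption about Spark's behavior rather than something stated on this page.

```python
from typing import Any, List, Optional


def array_repeat_model(element: Any, count: Optional[int]) -> Optional[List[Any]]:
    """Hypothetical plain-Python model of array_repeat for a single row."""
    if count is None:
        # Assumption: a null count produces a null result, not an empty array.
        return None
    # Python list repetition returns [] for a negative count;
    # treating that as matching Spark is an assumption.
    return [element] * count


print(array_repeat_model("ab", 3))                 # ['ab', 'ab', 'ab']
print(array_repeat_model(None, 3))                 # [None, None, None]
print(array_repeat_model(["apple", "banana"], 2))  # [['apple', 'banana'], ['apple', 'banana']]
```

A null element is repeated like any other value, and an array element produces a nested array, which matches the examples on this page.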

Examples

Example 1: Usage with string

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([('ab',)], ['data'])
df.select(sf.array_repeat(df.data, 3)).show()
Output
+---------------------+
|array_repeat(data, 3)|
+---------------------+
|         [ab, ab, ab]|
+---------------------+

Example 2: Usage with integer

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([(3,)], ['data'])
df.select(sf.array_repeat(df.data, 2)).show()
Output
+---------------------+
|array_repeat(data, 2)|
+---------------------+
|               [3, 3]|
+---------------------+

Example 3: Usage with array

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([(['apple', 'banana'],)], ['data'])
df.select(sf.array_repeat(df.data, 2)).show(truncate=False)
Output
+----------------------------------+
|array_repeat(data, 2)             |
+----------------------------------+
|[[apple, banana], [apple, banana]]|
+----------------------------------+

Example 4: Usage with null

Python
from pyspark.sql import functions as sf
from pyspark.sql.types import IntegerType, StructType, StructField
schema = StructType([
    StructField("data", IntegerType(), True)
])
df = spark.createDataFrame([(None, )], schema=schema)
df.select(sf.array_repeat(df.data, 3)).show()
Output
+---------------------+
|array_repeat(data, 3)|
+---------------------+
|   [NULL, NULL, NULL]|
+---------------------+