zip_with

Merges two arrays element-wise into a single array by applying a function to each pair of elements. If one array is shorter, it is padded with nulls at the end to match the length of the longer array before the function is applied. Supports Spark Connect.

For the corresponding Databricks SQL function, see zip_with function.
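The padding rule described above can be modeled in plain Python for a single row. This is only an illustrative sketch of the semantics, not how Spark implements the function: `zip_with_model` and `concat_skip_none` are hypothetical helpers, and `None` stands in for a SQL null.

```python
from itertools import zip_longest


def zip_with_model(left, right, f):
    """Plain-Python model of zip_with for one row (hypothetical helper)."""
    # zip_longest pads the shorter list with None, mirroring the null padding
    # that happens before the function is applied.
    return [f(x, y) for x, y in zip_longest(left, right, fillvalue=None)]


def concat_skip_none(x, y):
    """Join the non-None values with "_", similar in spirit to concat_ws."""
    return "_".join(str(v) for v in (x, y) if v is not None)


# Equal lengths: every pair has two real values.
powers = zip_with_model([1, 3, 5, 8], [0, 2, 4, 6], lambda x, y: float(x) ** y)

# Unequal lengths: the third pair is (None, 3), so f must tolerate None.
joined = zip_with_model(["foo", "bar"], [1, 2, 3], concat_skip_none)

print(powers)
print(joined)
```

Because the padding happens before the function runs, the function itself decides what a missing element means. A function that cannot handle nulls will typically produce a null for the padded positions.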

Syntax

Python
from pyspark.databricks.sql import functions as dbf

dbf.zip_with(left=<left>, right=<right>, f=<f>)

Parameters

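| Parameter | Type | Description |
| --- | --- | --- |
| left | pyspark.sql.Column or str | Name of the first column or expression. |
| right | pyspark.sql.Column or str | Name of the second column or expression. |
| f | function | A binary function that takes the paired elements from left and right as two Column arguments and returns a Column expression. |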

Returns

pyspark.sql.Column: an array of the values produced by applying the given function to each pair of elements.

Examples

Example 1: Merging two arrays with a simple function

Python
from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame([(1, [1, 3, 5, 8], [0, 2, 4, 6])], ("id", "xs", "ys"))
df.select(dbf.zip_with("xs", "ys", lambda x, y: x ** y).alias("powers")).show(truncate=False)
Output
+---------------------------+
|powers                     |
+---------------------------+
|[1.0, 9.0, 625.0, 262144.0]|
+---------------------------+

Example 2: Merging arrays of different lengths

Python
from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame([(1, ["foo", "bar"], [1, 2, 3])], ("id", "xs", "ys"))
df.select(dbf.zip_with("xs", "ys", lambda x, y: dbf.concat_ws("_", x, y)).alias("xs_ys")).show()
Output
+-----------------+
|            xs_ys|
+-----------------+
|[foo_1, bar_2, 3]|
+-----------------+