arrays_overlap
Returns a Boolean column indicating whether the two input arrays share at least one non-null element. The result is true if they do; null if they share no element, both arrays are non-empty, and at least one of them contains a null element; and false otherwise. If either input array is itself null, the result is null (see Example 3).
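The three-valued rule above can be easier to follow as code. The following is a minimal plain-Python sketch of that rule, not Spark's implementation: the helper name `arrays_overlap_model` is hypothetical, Python `None` stands in for SQL NULL, and the handling of a null array is taken from the behavior shown in Example 3.

```python
from typing import Optional, Sequence


def arrays_overlap_model(a1: Optional[Sequence], a2: Optional[Sequence]) -> Optional[bool]:
    """Hypothetical reference model of the documented arrays_overlap rule."""
    # A null array (as opposed to a null element) yields null, as in Example 3.
    if a1 is None or a2 is None:
        return None
    # True as soon as the arrays share a non-null element.
    if any(x is not None and x in a2 for x in a1):
        return True
    # No common element: null only if both arrays are non-empty
    # and at least one of them contains a null element.
    if a1 and a2 and (None in a1 or None in a2):
        return None
    return False


# The rows used in the examples on this page:
rows = [
    (["a", "b"], ["b", "c"]),    # common element "b"
    (["a"], ["b", "c"]),         # nothing in common, no nulls
    (["a", None], ["b", None]),  # nothing in common, nulls present
    (None, ["b", "c"]),          # first array is null
]
results = [arrays_overlap_model(x, y) for x, y in rows]
```

Under this model, an empty array paired with an array that contains a null element gives False rather than None, because the rule requires both arrays to be non-empty before a null result applies.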
Syntax
from pyspark.sql import functions as sf
sf.arrays_overlap(a1, a2)
Parameters
| Parameter | Type | Description |
|---|---|---|
| a1 | pyspark.sql.Column or str | The name of the column that contains the first array. |
| a2 | pyspark.sql.Column or str | The name of the column that contains the second array. |
Returns
pyspark.sql.Column: A new Column of Boolean type, where each value indicates whether the corresponding arrays from the input columns contain any common elements.
Examples
Example 1: Basic usage of arrays_overlap function.
from pyspark.sql import functions as sf
df = spark.createDataFrame([(["a", "b"], ["b", "c"]), (["a"], ["b", "c"])], ['x', 'y'])
df.select(sf.arrays_overlap(df.x, df.y)).show()
+--------------------+
|arrays_overlap(x, y)|
+--------------------+
|                true|
|               false|
+--------------------+
Example 2: Usage of arrays_overlap function with arrays containing null elements.
from pyspark.sql import functions as sf
df = spark.createDataFrame([(["a", None], ["b", None]), (["a"], ["b", "c"])], ['x', 'y'])
df.select(sf.arrays_overlap(df.x, df.y)).show()
+--------------------+
|arrays_overlap(x, y)|
+--------------------+
|                NULL|
|               false|
+--------------------+
Example 3: Usage of arrays_overlap function with arrays that are null.
from pyspark.sql import functions as sf
df = spark.createDataFrame([(None, ["b", "c"]), (["a"], None)], ['x', 'y'])
df.select(sf.arrays_overlap(df.x, df.y)).show()
+--------------------+
|arrays_overlap(x, y)|
+--------------------+
|                NULL|
|                NULL|
+--------------------+
Example 4: Usage of arrays_overlap on arrays with identical elements.
from pyspark.sql import functions as sf
df = spark.createDataFrame([(["a", "b"], ["a", "b"]), (["a"], ["a"])], ['x', 'y'])
df.select(sf.arrays_overlap(df.x, df.y)).show()
+--------------------+
|arrays_overlap(x, y)|
+--------------------+
|                true|
|                true|
+--------------------+