concat
Collection function: Concatenates multiple input columns into a single column. The function works with string, numeric, binary, and compatible array columns. Supports Spark Connect.
For the corresponding Databricks SQL function, see concat function.
Syntax
Python
from pyspark.databricks.sql import functions as dbf
dbf.concat(*cols)
Parameters
| Parameter | Type | Description |
|---|---|---|
| cols | pyspark.sql.Column or str | Target column or columns to work on. |
Returns
pyspark.sql.Column: the concatenated values. The type of the returned column depends on the types of the input columns.
Examples
Example 1: Concatenating string columns
Python
from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame([('abcd','123')], ['s', 'd'])
df.select(dbf.concat(df.s, df.d)).show()
Output
+------------+
|concat(s, d)|
+------------+
|     abcd123|
+------------+
Example 2: Concatenating array columns
Python
from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame([([1, 2], [3, 4], [5]), ([1, 2], None, [3])], ['a', 'b', 'c'])
df.select(dbf.concat(df.a, df.b, df.c)).show()
Output
+---------------+
|concat(a, b, c)|
+---------------+
|[1, 2, 3, 4, 5]|
|           NULL|
+---------------+
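The second row of the output above is NULL because one of its input arrays (b) is None. The following plain-Python sketch models that per-row behavior for array columns. The helper name concat_arrays is hypothetical and is not part of the PySpark API; it only illustrates the semantics visible in the example and does not run on Spark.

```python
from typing import List, Optional


def concat_arrays(*cols: Optional[List[int]]) -> Optional[List[int]]:
    """Hypothetical model of concat applied to the array values of one row."""
    # If any input is None, the whole result is None
    # (displayed as NULL in the example output).
    if any(col is None for col in cols):
        return None
    # Otherwise, append the elements of each array in argument order.
    result: List[int] = []
    for col in cols:
        result.extend(col)
    return result


# The two rows from Example 2.
row1 = concat_arrays([1, 2], [3, 4], [5])
row2 = concat_arrays([1, 2], None, [3])
print(row1)
print(row2)
```

If a missing array should be treated as empty rather than nullifying the row, the None values have to be replaced before concatenation.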