
concat

Collection function: Concatenates multiple input columns together into a single column. The function works with string, numeric, binary, and compatible array columns. Supports Spark Connect.

For the corresponding Databricks SQL function, see concat function.

Syntax

Python
from pyspark.sql import functions as sf

sf.concat(*cols)

Parameters

cols (pyspark.sql.Column or str)
    Target column or columns to work on.

Returns

pyspark.sql.Column: the concatenated values. The type of the returned Column depends on the types of the input columns.
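
As a quick illustration of that type behavior (a minimal sketch that assumes an active SparkSession named spark, as in the examples below): string inputs produce a string column, while array inputs produce an array column.

Python
from pyspark.sql import functions as sf

# String inputs yield a string column; array inputs yield an array column.
str_df = spark.createDataFrame([('ab', 'cd')], ['x', 'y'])
arr_df = spark.createDataFrame([([1], [2])], ['x', 'y'])

print(str_df.select(sf.concat('x', 'y')).dtypes)  # [('concat(x, y)', 'string')]
print(arr_df.select(sf.concat('x', 'y')).dtypes)  # [('concat(x, y)', 'array<bigint>')]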

Examples

Example 1: Concatenating string columns

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([('abcd', '123')], ['s', 'd'])
df.select(sf.concat(df.s, df.d)).show()
Output
+------------+
|concat(s, d)|
+------------+
|     abcd123|
+------------+
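
The description above also lists numeric inputs. The sketch below uses a hypothetical integer column n and casts it to string explicitly, which keeps the intent clear rather than relying on implicit coercion.

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([('order-', 42)], ['s', 'n'])
# Cast the numeric column to string before concatenating.
df.select(sf.concat(df.s, df.n.cast('string')).alias('label')).show()  # label: order-42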

Example 2: Concatenating array columns

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([([1, 2], [3, 4], [5]), ([1, 2], None, [3])], ['a', 'b', 'c'])
df.select(sf.concat(df.a, df.b, df.c)).show()
Output
+---------------+
|concat(a, b, c)|
+---------------+
|[1, 2, 3, 4, 5]|
|           NULL|
+---------------+
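
Example 2 shows that a single NULL input makes the whole result NULL. For string columns, one common alternative (sketched below with hypothetical column names) is concat_ws, which skips NULL inputs instead of propagating them.

Python
from pyspark.sql import functions as sf
df = spark.createDataFrame([('ab', None, 'cd')], 'x string, y string, z string')
# concat propagates the NULL; concat_ws drops NULL inputs and joins the rest.
df.select(
    sf.concat(df.x, df.y, df.z).alias('c'),           # NULL
    sf.concat_ws('-', df.x, df.y, df.z).alias('cw'),  # ab-cd
).show()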