unionByName

Returns a new DataFrame containing the union of rows in this DataFrame and another DataFrame, with columns matched by name.

Syntax

unionByName(other: "DataFrame", allowMissingColumns: bool = False)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| other | DataFrame | Another DataFrame that needs to be combined. |
| allowMissingColumns | bool, optional, default False | Specify whether to allow missing columns. |

Returns

DataFrame: A new DataFrame containing the combined rows with corresponding columns of the two given DataFrames.

Notes

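This method unions the two input DataFrames, resolving columns by name rather than by position. When allowMissingColumns is False (the default), both DataFrames must have the same set of column names. When allowMissingColumns is True, a column that exists in only one DataFrame is filled with null in the rows that come from the other.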
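The name-based resolution described above can be modeled in plain Python. This is only an illustrative sketch, not Spark's implementation: the helper `union_by_name`, the `(columns, rows)` tuple representation, and the use of `ValueError` are all assumptions made for the example. In Spark itself, a column mismatch is reported as an analysis error instead.

```python
# Plain-Python model of a name-based union (illustrative only, not Spark code).
# A "frame" here is a hypothetical (columns, rows) pair.

def union_by_name(left, right, allow_missing_columns=False):
    left_cols, left_rows = left
    right_cols, right_rows = right

    # Columns that appear on one side only.
    only_left = [c for c in left_cols if c not in right_cols]
    only_right = [c for c in right_cols if c not in left_cols]
    if (only_left or only_right) and not allow_missing_columns:
        raise ValueError(f"column mismatch: {only_left + only_right}")

    # Result schema: left columns first, then columns found only on the right.
    out_cols = list(left_cols) + only_right

    def realign(cols, rows):
        # Look each output column up by name; use None where it is absent.
        index = {name: i for i, name in enumerate(cols)}
        return [
            tuple(row[index[c]] if c in index else None for c in out_cols)
            for row in rows
        ]

    return out_cols, realign(left_cols, left_rows) + realign(right_cols, right_rows)


df1 = (["col0", "col1", "col2"], [(1, 2, 3)])
df2 = (["col1", "col2", "col0"], [(4, 5, 6)])  # same names, different order
df3 = (["col1", "col2", "col3"], [(4, 5, 6)])  # col0 missing, col3 extra

same_cols, same_rows = union_by_name(df1, df2)
fill_cols, fill_rows = union_by_name(df1, df3, allow_missing_columns=True)
```

In this model, `same_rows` is `[(1, 2, 3), (6, 4, 5)]` because the second frame's values are reordered to fit the first frame's schema, and `fill_rows` is `[(1, 2, 3, None), (None, 4, 5, 6)]` with `None` standing in for null.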

Examples

Python
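# Assumes an active SparkSession is available as `spark`.
# df2 has the same columns in a different order; rows are aligned by column name.
df1 = spark.createDataFrame([[1, 2, 3]], ["col0", "col1", "col2"])
df2 = spark.createDataFrame([[4, 5, 6]], ["col1", "col2", "col0"])
df1.unionByName(df2).show()
# +----+----+----+
# |col0|col1|col2|
# +----+----+----+
# |   1|   2|   3|
# |   6|   4|   5|
# +----+----+----+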

# df2 lacks col0 and df1 lacks col3; allowMissingColumns=True fills the gaps with null.
df1 = spark.createDataFrame([[1, 2, 3]], ["col0", "col1", "col2"])
df2 = spark.createDataFrame([[4, 5, 6]], ["col1", "col2", "col3"])
df1.unionByName(df2, allowMissingColumns=True).show()
# +----+----+----+----+
# |col0|col1|col2|col3|
# +----+----+----+----+
# |   1|   2|   3|NULL|
# |NULL|   4|   5|   6|
# +----+----+----+----+