exceptAll

Return a new DataFrame containing rows in this DataFrame but not in another DataFrame while preserving duplicates.

Syntax

exceptAll(other: "DataFrame")

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| other | DataFrame | The other DataFrame to compare to. |

Returns

DataFrame

Notes

This is equivalent to EXCEPT ALL in SQL. As is standard in SQL, this function resolves columns by position, not by name, so both DataFrames should list their columns in the same order.

Examples

Python
df1 = spark.createDataFrame(
    [("a", 1), ("a", 1), ("a", 1), ("a", 2), ("b", 3), ("c", 4)], ["C1", "C2"])
df2 = spark.createDataFrame([("a", 1), ("b", 3)], ["C1", "C2"])
df1.exceptAll(df2).show()
# +---+---+
# | C1| C2|
# +---+---+
# |  a|  1|
# |  a|  1|
# |  a|  2|
# |  c|  4|
# +---+---+
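
The duplicate-preserving behavior in the example above can be modeled without Spark as a multiset difference: each row in the other DataFrame cancels at most one matching row in this one. The sketch below is only an illustration of those semantics, not how Spark implements `exceptAll`. The helper name `except_all_model` is hypothetical, and rows are represented as plain tuples, so they are compared position by position with no column names involved.

```python
from collections import Counter


def except_all_model(left, right):
    """Hypothetical helper: multiset difference over rows given as tuples."""
    # Counter subtraction lowers each row's count by its count in `right`
    # and drops rows whose count falls to zero or below.
    remaining = Counter(left) - Counter(right)
    # elements() repeats each surviving row as many times as its count.
    return list(remaining.elements())


rows1 = [("a", 1), ("a", 1), ("a", 1), ("a", 2), ("b", 3), ("c", 4)]
rows2 = [("a", 1), ("b", 3)]

result = except_all_model(rows1, rows2)
print(result)
```

Three copies of `("a", 1)` minus one leaves two, and the single `("b", 3)` is removed entirely, which matches the table printed by `show()`. Because the tuples are compared by position, `("a", 1)` and `(1, "a")` count as different rows in this model, mirroring the positional column resolution described in the Notes.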