unionAll

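unionAll appends the rows of one SparkDataFrame to those of another and returns the result as a new SparkDataFrame. It is equivalent to the UNION ALL operator in SQL: rows that appear in both SparkDataFrames are kept, not deduplicated. As in SQL, columns are matched by position rather than by name.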

Syntax:

  • unionAll(df1, df2)

Parameters:

  • df1: Any SparkDataFrame
  • df2: Any SparkDataFrame

Output:

  • SparkDataFrame

Example:

library(SparkR)

# Create 2 SparkDataFrames
smallDF <- createDataFrame(data.frame(name = c("Mouse", "Rabbit", "Bird"),
                                      count = c(3, 5, 4)))
bigDF <- createDataFrame(data.frame(name = c("Elephant", "Buffalo", "Bird"),
                                    count = c(1, 2, 4)))
head(smallDF)
head(bigDF)

# Combine the 2 SparkDataFrames
unionDF <- unionAll(smallDF, bigDF)

# Count the rows. Because duplicate rows are kept, the result has 6 rows (3 + 3)
count(unionDF)
head(unionDF)
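
If the duplicate rows are not wanted, one option is to apply distinct to the combined result. The following is a minimal sketch, not part of the original example; it assumes a Spark session can be started with sparkR.session() (on platforms where a session already exists, that call is unnecessary), and the name dedupedDF is chosen here for illustration.

```r
library(SparkR)
sparkR.session()  # assumption: no SparkR session is active yet

# Same two SparkDataFrames as in the example above
smallDF <- createDataFrame(data.frame(name = c("Mouse", "Rabbit", "Bird"),
                                      count = c(3, 5, 4)))
bigDF <- createDataFrame(data.frame(name = c("Elephant", "Buffalo", "Bird"),
                                    count = c(1, 2, 4)))

# unionAll keeps the ("Bird", 4) row from both inputs;
# distinct then collapses identical rows into one
dedupedDF <- distinct(unionAll(smallDF, bigDF))

# 6 combined rows minus 1 duplicate leaves 5
count(dedupedDF)
```

This mirrors the difference between UNION ALL and UNION in SQL: the deduplication is a separate, explicit step.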

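To combine more than 2 SparkDataFrames in a single call, use rbind, which accepts any number of SparkDataFrames that share the same column names. Like unionAll, it does not remove duplicate rows.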
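
As a rough illustration of rbind with three inputs, the sketch below adds a third SparkDataFrame. The name tinyDF and its contents are hypothetical and introduced only for this example, and the sparkR.session() call assumes no session is active yet.

```r
library(SparkR)
sparkR.session()  # assumption: no SparkR session is active yet

smallDF <- createDataFrame(data.frame(name = c("Mouse", "Rabbit", "Bird"),
                                      count = c(3, 5, 4)))
bigDF <- createDataFrame(data.frame(name = c("Elephant", "Buffalo", "Bird"),
                                    count = c(1, 2, 4)))
# Hypothetical third SparkDataFrame with the same columns
tinyDF <- createDataFrame(data.frame(name = c("Ant", "Bee"),
                                     count = c(10, 20)))

# rbind stacks the rows of all three inputs, keeping duplicates
allDF <- rbind(smallDF, bigDF, tinyDF)

# 3 + 3 + 2 rows
count(allDF)
head(allDF, 8)
```

The same result could be produced by nesting calls, as in unionAll(unionAll(smallDF, bigDF), tinyDF), but rbind reads more cleanly as the number of inputs grows.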