createDataFrame

createDataFrame creates SparkDataFrames from local R data frames.

SparkR's distributed SparkDataFrame implementation supports operations such as selection, filtering, and aggregation on large datasets.

Syntax:

  • createDataFrame(localdf)

Parameters:

  • localdf: local R data frame

Output:

  • SparkDataFrame
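
Examples:

library(SparkR)

# Start or reuse a Spark session (notebook environments may already provide one)
sparkR.session()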

# Create SparkDataFrame using the faithful dataset from R
df <- createDataFrame(faithful)

# Display the first rows of the SparkDataFrame
head(df)

# Create a local R data frame
localdf <- data.frame(customer = c("James", "Peter", "Jane", "James"),
                      amount = c(5, 5, 6, 5))
str(localdf)

# Convert to SparkDataFrame
sparkdf <- createDataFrame(localdf)
str(sparkdf)
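
The selection, filtering, and aggregation operations mentioned at the top of this page can be sketched on the same small customer data. This is a minimal illustration, not a definitive recipe: it assumes a working Spark installation that sparkR.session() can find, and the variable names (customers, large, totals) and the output column name total are made up for the example.

```r
library(SparkR)

# Assumption: Spark is installed and discoverable; this starts or reuses a session
sparkR.session()

# Same local data frame as in the example above
localdf <- data.frame(customer = c("James", "Peter", "Jane", "James"),
                      amount = c(5, 5, 6, 5))
sparkdf <- createDataFrame(localdf)

# Selection: keep only the customer column
customers <- select(sparkdf, "customer")

# Filtering: keep rows whose amount is greater than 5
large <- filter(sparkdf, sparkdf$amount > 5)

# Aggregation: total amount per customer ("total" is an illustrative column name)
totals <- summarize(groupBy(sparkdf, sparkdf$customer),
                    total = sum(sparkdf$amount))

# collect() brings the (small) results back as local R data frames
customers_local <- collect(customers)
large_local <- collect(large)
totals_local <- collect(totals)

print(large_local)
print(totals_local)
```

collect() is only appropriate here because the results are tiny; on large datasets, head() or a further aggregation is usually a safer way to inspect a SparkDataFrame.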