table

table converts a Spark SQL table into a SparkR DataFrame.

Note:

To create contingency tables, use the crosstab function.

Syntax:

  • table(sqlContext, tableName)

Parameters:

  • sqlContext: SQLContext. This has already been created for you as sqlContext
  • tableName: String, the name of the Spark SQL table

Output:

  • SparkR DataFrame

See also the SparkR Programming Guide: http://spark.apache.org/docs/latest/sparkr.html

Let's create a temporary Spark SQL table using a CSV file.

-- mode "FAILFAST" aborts file parsing with a RuntimeException if any malformed lines are encountered

CREATE TEMPORARY TABLE temp_diamonds
USING com.databricks.spark.csv
OPTIONS (path "/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv", header "true", mode "FAILFAST");

SELECT * FROM temp_diamonds;

We now have a temporary table called temp_diamonds. We can use the SparkR table function to convert it into a SparkR DataFrame.

diamondsDF <- table(sqlContext, "temp_diamonds")
head(diamondsDF)
# table() creates a SparkR DataFrame
str(diamondsDF)
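Once converted, the result supports the usual SparkR DataFrame operations. A brief sketch, assuming diamondsDF was created from temp_diamonds as above; count and printSchema are standard SparkR functions:

    # number of rows in the DataFrame (triggers a Spark job)
    count(diamondsDF)
    # print the schema inferred from the CSV header
    printSchema(diamondsDF)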

Note that we can also create a SparkR DataFrame from a Spark SQL table with the sql function, by passing a SQL query.

diamondsSQL <- sql(sqlContext, "SELECT * FROM temp_diamonds")
head(diamondsSQL)
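As noted above, table is not the tool for contingency tables; those come from crosstab. A minimal sketch, assuming diamondsDF was created as above (cut and color are columns of the diamonds dataset); crosstab returns a local R data.frame of pairwise counts:

    # contingency table of cut vs. color; the result is a local R data.frame
    cutByColor <- crosstab(diamondsDF, "cut", "color")
    cutByColor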