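from pyspark.sql.functions import *
from pyspark.sql.types import *

# Convenience function for turning JSON strings into DataFrames.
# Assumes `spark` (SparkSession) and `sc` (SparkContext) are already defined,
# as they are in a notebook session.
def jsonToDataFrame(json, schema=None):
    # SparkSessions are available with Spark 2.0+
    reader = spark.read
    if schema:
        # Use the caller-supplied schema instead of inferring one from the data.
        reader = reader.schema(schema)
    return reader.json(sc.parallelize([json]))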
Transforming Complex Data Types in Spark SQL
In this notebook we walk through some data transformation examples using Spark SQL. Spark SQL supports many built-in transformation functions in the module pyspark.sql.functions, so we start by importing it.