Higher-order functions
Databricks provides dedicated primitives for manipulating arrays in Apache Spark SQL. These primitives make array handling more concise and remove the need for large amounts of boilerplate code. They are built on two functional programming constructs: higher-order functions and anonymous (lambda) functions, which together let you define array-manipulating functions directly in SQL. A higher-order function takes an array, defines how that array is processed, and determines the form of the result. It delegates the processing of each item in the array to a lambda function.
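The division of labor described above can be sketched in plain Python. This is only a conceptual model, not how Spark implements these primitives: the helper names `transform` and `filter_array` are hypothetical stand-ins chosen to mirror the Spark SQL expressions `transform(values, x -> x + 1)` and `filter(values, x -> x % 2 = 0)`, and the model ignores Spark's null handling.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def transform(values: List[T], fn: Callable[[T], U]) -> List[U]:
    """Hypothetical model of a higher-order function.

    The function owns the traversal (in order, one item at a time) and the
    result shape (a new array of the same length). The lambda only decides
    what happens to a single item.
    """
    return [fn(item) for item in values]


def filter_array(values: List[T], predicate: Callable[[T], bool]) -> List[T]:
    """Hypothetical model of a filtering higher-order function.

    Same traversal, different result shape: only the items for which the
    lambda returns True are kept.
    """
    return [item for item in values if predicate(item)]


values = [1, 2, 3, 4]

# Roughly what `transform(values, x -> x + 1)` expresses in Spark SQL.
incremented = transform(values, lambda x: x + 1)

# Roughly what `filter(values, x -> x % 2 = 0)` expresses in Spark SQL.
evens = filter_array(values, lambda x: x % 2 == 0)

print(incremented)
print(evens)
```

Note that the lambda never sees the array itself. Swapping the lambda changes the per-item logic, while swapping the higher-order function changes how the results are assembled.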
Introduction to higher-order functions notebook
Higher-order functions tutorial Python notebook
Apache Spark built-in functions
Apache Spark includes built-in functions, among them higher-order functions, for manipulating complex types such as arrays.
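As a rough illustration of what such built-ins compute, here is a plain-Python model of two of Spark SQL's higher-order functions, `exists` and `aggregate`. The Python helpers are hypothetical and only approximate the semantics: they assume arrays without null elements, whereas Spark applies SQL null rules.

```python
from functools import reduce
from typing import Any, Callable, List


def exists(values: List[Any], predicate: Callable[[Any], bool]) -> bool:
    """Hypothetical model: True if the lambda holds for at least one item."""
    return any(predicate(item) for item in values)


def aggregate(
    values: List[Any],
    start: Any,
    merge: Callable[[Any, Any], Any],
    finish: Callable[[Any], Any] = lambda acc: acc,
) -> Any:
    """Hypothetical model: fold the array into one value.

    `merge` combines the running accumulator with each item in order,
    starting from `start`; `finish` optionally post-processes the result.
    """
    return finish(reduce(merge, values, start))


readings = [3, 8, 5]

# Roughly `exists(readings, x -> x > 7)` in Spark SQL.
has_large = exists(readings, lambda x: x > 7)

# Roughly `aggregate(readings, 0, (acc, x) -> acc + x)` in Spark SQL.
total = aggregate(readings, 0, lambda acc, x: acc + x)

# The optional finish lambda turns the accumulated sum into a doubled sum.
doubled_total = aggregate(readings, 0, lambda acc, x: acc + x, lambda acc: acc * 2)

print(has_large, total, doubled_total)
```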
The following notebook illustrates Apache Spark built-in functions.