Spark SQL provides two kinds of functions: built-in functions and user-defined functions (UDFs).

Built-in functions

This article describes frequently used built-in functions and how to use them, grouped into the following categories: aggregation, arrays and maps, dates and timestamps, and JSON data.

SQL user-defined functions

SQL user-defined functions (UDFs) are functions that you define yourself and that return either scalar values or result sets.

See CREATE FUNCTION (SQL) for more information.

User-defined functions

UDFs let you define your own functions when the system’s built-in functions are not enough to perform the desired task. Using a UDF takes three steps: define the function, register it with Spark, and call the registered function. A UDF can operate on a single row or on multiple rows at once. Spark SQL also supports integrating existing Hive implementations of UDFs, user-defined aggregate functions (UDAFs), and user-defined table functions (UDTFs).
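
The define, register, and call sequence described above can be sketched in PySpark as follows. This is a minimal illustration rather than a recommended implementation: the function name to_title_case, the view name people, and the wrapper run_udf_demo are hypothetical and chosen only for this example, and the sketch assumes a local Spark session is available.

```python
def to_title_case(text):
    """Step 1: define the function in plain Python (hypothetical example logic).

    Returns None for NULL input so that missing values pass through unchanged.
    """
    if text is None:
        return None
    return text.title()


def run_udf_demo():
    """Register and call the UDF; assumes PySpark is installed locally."""
    # Imported here so the plain function above can be used without Spark.
    from pyspark.sql import SparkSession
    from pyspark.sql.types import StringType

    spark = (
        SparkSession.builder.master("local[1]").appName("udf-sketch").getOrCreate()
    )

    # Step 2: register the function with Spark under a SQL-callable name.
    spark.udf.register("to_title_case", to_title_case, StringType())

    # Step 3: call the registered function from a SQL query.
    # The view name "people" is an assumption made for this sketch.
    rows = [("ada lovelace",), (None,)]
    spark.createDataFrame(rows, "name string").createOrReplaceTempView("people")
    spark.sql(
        "SELECT name, to_title_case(name) AS title_name FROM people"
    ).show()

    spark.stop()
```

Calling run_udf_demo() in an environment where PySpark is available runs all three steps. Keeping the row-level logic in an ordinary Python function makes it possible to check it without starting Spark. A UDF written this way is evaluated one row at a time, which is the single-row case mentioned above.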