Databricks SQL lambda functions

A parameterized expression that can be passed to a function to control its behavior.

For example, array_sort function (Databricks SQL) accepts a lambda function as an argument to define a custom sort order.

Syntax

{ param -> expr |
  (param1 [, ...] ) -> expr }

Parameters

  • paramN: An identifier used by the parent function to pass arguments for the lambda function.
  • expr: Any simple expression referencing paramN, which does not contain a subquery.

Returns

The result type is defined by the result type of expr.

If there is more than one paramN, the parameter names must be unique. The types of the parameters are set by the invoking function. The expression must be valid for these types and the result type must match the defined expectations of the invoking functions.

Examples

The array_sort function (Databricks SQL) function expects a lambda function with two parameters. The parameter types will be the type of the elements of the array to be sorted. The expression is expected to return an INTEGER where -1 means param1 < param2, 0 means param1 = param2, and 1 otherwise.

To sort an ARRAY of STRING in a right to left lexical order, you can use the following lambda function.

(p1, p2) -> CASE WHEN p1 = p2 THEN 0
                 WHEN reverse(p1) < reverse(p2) THEN -1
                 ELSE 1 END

Lambda functions are defined and used ad hoc. So the function definition is the argument:

> SELECT array_sort(array('Hello', 'World'),
  (p1, p2) -> CASE WHEN p1 = p2 THEN 0
              WHEN reverse(p1) < reverse(p2) THEN -1
              ELSE 1 END);
[World, Hello]