Function invocation
Applies to: Databricks SQL Databricks Runtime
A function invocation executes a builtin function or a user-defined function after associating arguments to the function’s parameters.
Databricks supports positional parameter invocation as well as named parameter invocation.
Positional parameter invocation
Each argument is assigned to the matching parameter at the position it is specified.
This notation can be used by all functions unless it is explicitly documented that named parameter invocation is required.
If the function supports optional parameters, trailing parameters for which no arguments have been specified, are defaulted.
Named parameter invocation
Arguments are explicitly assigned to parameters using the parameter names published by the function.
This notation must be used for a select subset of built-in functions which allow numerous optional parameters, making positional parameter invocation impractical. These functions may allow a mixed invocation where a leading set of parameters are expected to be assigned by position and the trailing, optional set of parameters by name.
Named parameter invocation, including mixed invocation, can also be used for SQL UDF and Python UDF.
Syntax
function_name ( [ argExpr | table_argument | star_clause ] [, ...]
[ namedParameter => [ argExpr | table_argument ] [, ...] )
table_argument
{ TABLE ( { table_name | query } )
[ table_partition ]
[ table_order ]
table_partitioning
{ WITH SINGLE PARTITION |
{ PARTITION | DISTRIBUTE } BY { partition_expr | ( partition_expr [, ...] ) } }
table_ordering
{ { ORDER | SORT } BY { order_by_expr | ( order_by_expr [, ...] } }
Parameters
-
The name of the built-in or user defined function. When resolving an unqualified
function_name
Databricks will first consider a built-in or temporary function, and then a function in the current schema. -
Any expression which can be implicitly cast to the parameter it is associated with.
The function may impose further restriction on the argument such as mandating literals, constant expressions, or specific values.
-
A shorthand to name all the referenceable columns in the
FROM
clause, or a specific table reference’s columns or fields in theFROM
clause. table_argument
Databricks SQL Databricks Runtime 14.0 and aboveSpecifies an argument for a parameter that is a table.
TABLE
( table_name )Identifies a table to pass to the function by name.
TABLE
( query )Passes the result of
query
to the function.table-partitioning
Databricks SQL Databricks Runtime 14.1 and aboveOptionally specifies that the table argument is partitioned. If not specified the partitioning is determined by Databricks.
WITH SINGLE PARTITION
The table argument is not partitioned.
-
One or more expressions defining how to partition the table argument. Each expression can be composed of columns presents in the table argument, literals, parameters, variables, and deterministic functions.
table-ordering
Databricks SQL Databricks Runtime 14.1 and aboveOptionally specifies an order in which the result rows of each partition of the table argument are passed to the function.
By default, the order is undefined.
-
One or more expressions. Each expression can be composed of columns presents in the table argument, literals, parameters, variables, and deterministic functions.
-
-
Databricks SQL Databricks Runtime 14.0 and above
The unqualified name of a parameter to which the
argExpr
is assigned.Named parameter notation is supported for SQL UDF, Python UDF, and specific built-in functions.
Examples
-- The substr function has three parameter and expects arguents to be passed by position.
> SELECT substr('hello', 3, 2);
ll
-- The last parameter, the length, of substr is optional, when it is ommited it retrns the remainder of the string.
> SELECT substr('hello', 3);
llo
-- Use the star clause to turn a set of columns into an array.
> SELECT array(*) FROM VALUES (1, 2, 3) AS t(a, b, c);
[1, 2, 3]
-- The second parameter, start position, is not optional
> SELECT substr('hello');
Error: WRONG_NUM_ARGS
-- read_files() is a function that accepts numerous parameters, many of which depend on the data source
-- The first parameter is positional, after that use named parameter invocation
> SELECT * FROM read_files(
's3://bucket/path',
format => 'csv',
schema => 'id int, ts timestamp, event string');
-- cloud_files_state() is a function that expects a table name as an argument
> SELECT path FROM cloud_files_state(TABLE(mytable));
/some/input/path
/other/input/path
-- Invoking a SQL UDF using named parameter invocation
> CREATE TEMPORARY FUNCTION increase(base INT, factor FLOAT DEFAULT 1) RETURNS INT RETURN base * factor;
-- Out of order assignment
> SELECT increase(factor => 1.2, base => 100);
120
-- Mixed invocation
> SELECT increase(100, factor => 1.3);
130
-- Using default
> SELECT increase(base => 100);
100
-- No position invocation after named invocation is allowed
> SELECT increase(base => 100, 1.4);
Error: UNEXPECTED_POSITIONAL_ARGUMENT