CREATE FUNCTION (SQL)

Preview

This feature is in Public Preview.

Note

Available in Databricks Runtime 9.1 and above.

Creates a SQL scalar or table function that takes a set of arguments and returns a scalar value or a set of rows.

Syntax

CREATE [OR REPLACE] [TEMPORARY] FUNCTION [IF NOT EXISTS]
    [database_name.] function_name ([function_parameter[,...]])
    RETURNS {data_type | TABLE (function_parameter[,...])
    [characteristic...]
    RETURN { expression | query }

function_parameter:
    param_name data_type [COMMENT parameter_comment]

characteristic:
    LANGUAGE SQL
  | [NOT] DETERMINISTIC
  | COMMENT function_comment
  | [CONTAINS SQL | READS SQL DATA]

Parameters

  • OR REPLACE

    If specified, the function with the same name and signature (number of parameters and parameter types) is replaced. You cannot replace an existing function with a different signature. This is mainly useful to update the function body and the return type of the function. You cannot specify this parameter with IF NOT EXISTS.

  • TEMPORARY

    The scope of the function being created. When you specify TEMPORARY, the created function is valid and visible in the current session. No persistent entry is made in the catalog.

  • IF NOT EXISTS

    If specified, creates the function only when it does not exist. The creation of the function succeeds (no error is thrown) if the specified function already exists in the system. You cannot specify this parameter with OR REPLACE.

  • [database_name.] function_name

    The name of the function to be created. For a permanent function, you can optionally qualify the function name with a database name.

  • function_parameter

    The optional function parameter list. A function parameter consists of a name, a valid data type, and an optional comment. If the function does not take any parameter, use an empty parameter list ().

  • RETURNS data_type

    The return data type of the scalar function.

  • RETURNS TABLE (function_parameter[,…])

    The signature of the result of the table function.

  • RETURN { expression | query }

    The body of the function. For a scalar function, it can either be a query or an expression. For a table function, it can only be a query. The expression cannot contain:

    • Aggregate and window functions
    • Row producing functions such as explode

Optional characteristic clauses

All characteristic clauses are optional. You can specify any number of them in any order, but you can specify each clause only once.

  • LANGUAGE SQL

    The language of the function. SQL is the only supported language.

  • [NOT] DETERMINISTIC

    Whether the function is deterministic. A function is deterministic when it returns only one result for a given set of parameters.

  • COMMENT function_comment

    A comment for the function.

  • CONTAINS SQL | READS SQL DATA

    Whether a function reads data directly or indirectly from a table or a view. When the function reads SQL data, ​you cannot specify CONTAINS SQL. If you don’t specify either clause, the property is derived from the function body.

Examples

Create and use a SQL scalar function

CREATE VIEW t(c1, c2) AS VALUES (0, 1), (1, 2);
-- Create a temporary function with no parameter.
CREATE TEMPORARY FUNCTION hello() RETURNS STRING RETURN 'Hello World!';

SELECT hello();
+------------+
|hello()     |
+------------+
|Hello World!|
+------------+

-- Create a permanent function with parameters.
CREATE FUNCTION area(x DOUBLE, y DOUBLE) RETURNS DOUBLE RETURN x * y;

-- Use a SQL function in the SELECT clause of a query.
SELECT area(c1, c2) AS area FROM t;
+----+
|area|
+----+
|0.0 |
|2.0 |
+----+

-- Use a SQL function in the WHERE clause of a query.
SELECT * FROM t WHERE area(c1, c2) > 0;
+---+---+
|c1 |c2 |
+---+---+
|1  |2  |
+---+---+

-- Compose SQL functions.
CREATE FUNCTION square(x DOUBLE) RETURNS DOUBLE RETURN area(x, x);

SELECT c1, square(c1) AS square FROM t;
+---+------+
|c1 |square|
+---+------+
|0  |0.0   |
|1  |1.0   |
+---+------+

-- Create a non-deterministic function.
CREATE FUNCTION dice(n INT) RETURNS INT
NOT DETERMINISTIC
COMMENT 'a n-sided dice'
RETURN FLOOR((RAND() * n) + 1);

SELECT dice(6) AS dice_6;
+------+
|dice_6|
+------+
|3     |
+------+

-- Create a SQL function with a scalar subquery.
CREATE VIEW scores(player, score) AS VALUES (0, 1), (0, 2), (1, 2), (1, 5);

CREATE FUNCTION avg_score(p INT) RETURNS FLOAT
COMMENT 'get an average score of the player'
RETURN SELECT AVG(score) FROM scores WHERE player = p;

SELECT c1, avg_score(c1) AS avg_score FROM t;
+---+---------+
|c1 |avg_score|
+---+---------+
|0  |1.5      |
|1  |3.5      |
+---+---------+

Create and use a SQL table function

-- Create an employee table.
CREATE TABLE employee (id INT, name STRING, deptno INT);
INSERT INTO employee VALUES
    (101, 'John', 1),
    (102, 'Lisa', 2),
    (103, 'Paul', 3),
    (104, 'Evan', 4),
    (105, 'Chloe', 5),
    (106, 'Amy', 6);

-- Create a department table.
CREATE TABLE department (deptno INT, deptname STRING);
INSERT INTO department VALUES
    (1, 'Marketing'),
    (2, 'Sales'),
    (3, 'Engineering');

-- Create a SQL table function to get all employees with a given department number.
CREATE FUNCTION getemps(deptno INT)
RETURNS TABLE(id INT, name STRING)
COMMENT 'Get all employees with the department number'
RETURN SELECT id, name FROM employee e WHERE e.deptno = getemps.deptno

-- Use a SQL table function.
SELECT * FROM getemps(1);
+---+----+
|id |name|
+---+----+
|101|John|
+---+----+

-- Use a SQL table function with LATERAL.
SELECT getemps.* FROM department d, LATERAL getemps(d.deptno);
+---+----+
|id |name|
+---+----+
|101|John|
|102|Lisa|
|103|Paul|
+---+----+

Replace a SQL function

-- Replace a SQL scalar function.
CREATE OR REPLACE FUNCTION square(x DOUBLE) RETURNS DOUBLE RETURN x * x;

-- Replace a SQL table function.
CREATE OR REPLACE FUNCTION getemps(deptno INT)
RETURNS TABLE (name STRING)
RETURN SELECT name FROM employee e WHERE e.deptno = getemps.deptno;

Note

You cannot replace an existing function with a different signature.

Describe a SQL function

DESCRIBE FUNCTION hello();
+----------------+
|function_desc   |
+----------------+
|Function: hello |
|Type:     SCALAR|
|Input:    ()    |
|Returns:  STRING|
+----------------+

DESCRIBE FUNCTION area;
+----------------------+
|function_desc         |
+----------------------+
|Function: default.area|
|Type:     SCALAR      |
|Input:    x DOUBLE    |
|          y DOUBLE    |
|Returns:  DOUBLE      |
+----------------------+

DESCRIBE FUNCTION dice;
+----------------------+
|function_desc         |
+----------------------+
|Function: default.dice|
|Type:     SCALAR      |
|Input:    n INT       |
|Returns:  INT         |
+----------------------+

DESCRIBE FUNCTION EXTENDED dice;
+-------------------------------------------+
|function_desc                              |
+-------------------------------------------+
|Function:      default.dice                |
|Type:          SCALAR                      |
|Input:         n INT                       |
|Returns:       INT                         |
|Comment:       a n-sided dice              |
|Deterministic: false                       |
|Owner:         user                        |
|Create Time:   Fri Apr 16 10:00:00 PDT 2021|
|Body:          FLOOR((RAND() * n) + 1)     |
+-------------------------------------------+

-- Describe a SQL table function.
DESCRIBE FUNCTION getemps;
+-------------------------+
|function_desc            |
+-------------------------+
|Function: default.getemps|
|Type:     TABLE          |
|Input:    deptno INT     |
|Returns:  id   INT       |
|          name STRING    |
+-------------------------+