SQL expression

Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime

An expression is a formula that computes a result based on literals or references to columns, fields, or variables, using functions or operators.

Syntax

{ literal |
  named_parameter_marker |
  unnamed_parameter_marker |
  column_reference |
  field_reference |
  parameter_reference |
  CAST expression |
  CASE expression |
  expr operator expr |
  operator expr |
  expr [ expr ] |
  function_invocation |
  ( expr ) |
  ( expr, expr [, ... ] ) |
  scalar_subquery }

scalar_subquery
  ( query )

The brackets in expr [ expr ] are actual brackets and do not indicate optional syntax.

Parameters

  • literal

    A literal of a type described in Data types.

  • named_parameter_marker

    A named and typed placeholder for a value provided by the API submitting the SQL statement.

  • unnamed_parameter_marker

    An unnamed and typed placeholder for a value provided by the API submitting the SQL statement.

  • column_reference

    A reference to a column in a table or column alias.

  • field_reference

    A reference to a field in a STRUCT type.

  • parameter_reference

    A reference to a parameter of a SQL user defined function from with the body of the function. The reference may use the unqualified name of the parameter or qualify the name with the function name. Parameters constitute the outermost scope when resolving identifiers.

  • CAST expression

    An expression casting the argument to a different type.

  • CASE expression

    An expression allowing for conditional evaluation.

  • expr

    An expression itself which is combined with an operator, or which is an argument to a function.

  • operator

    A unary or binary operator.

  • expr [ expr ]

    A reference to an array element or a map key.

  • function_invocation

    An invocation of a built-in or user defined function.

    See function_invocation for details.

  • ( expr )

    Enforced precedence that overrides operator precedence.

  • ( expr, expr [, … ] )

    Creates a struct of two or more fields. This notation is synonymous to the struct function.

  • scalar_subquery:

    • ( query )

      An expression based on a query that must return a single column and at most one row.

The pages for each function and operator describe the data types their parameters expect. Databricks performs implicit casting to expected types using SQL data type rules. If an operator or function is invalid for the provided argument, Databricks raises an error.

See Column, field, parameter, and variable resolution for more information on name resolution.

Constant expression

An expression that is based only on literals or deterministic functions with no arguments. Databricks can execute the expression and use the resulting constant where ordinarily literals are required.

Boolean expression

An expression with a result type of BOOLEAN. A Boolean expression is also sometimes referred to as a condition or a predicate.

Scalar subquery

An expression of the form ( query ). The query must return a table that has one column and at most one row.

If the query returns no row, the result is NULL. If the query returns more than one row, Databricks returns an error. Otherwise, the result is the value returned by the query.

Simple expression

An expression that does not contain a query, such as a scalar subquery or an EXISTS predicate.

Examples

> SELECT 1;
  1

> SELECT (SELECT 1) + 1;
  2

> SELECT 1 + 1;
  2

> SELECT 2 * (1 + 2);
  6

> SELECT 2 * 1 + 2;
  4

> SELECT substr('Spark', 1, 2);
  Sp

> SELECT c1 + c2 FROM VALUES(1, 2) AS t(c1, c2);
  3

> SELECT a[1] FROM VALUES(array(10, 20)) AS T(a);
  20

> SELECT true;
  true

> SELECT (c1, (c2, c3)) FROM VALUES(1, 2, 3) AS T(c1, c2, c3);
  {"c1":1,"col2":{"c2":2,"c3":3}}