string_agg
aggregate function
Applies to: Databricks Runtime 16.4 and later
Returns concatenated STRING
and BINARY
values within a group.
This function is an alias for listagg function.
Syntax
string_agg ( [ALL | DISTINCT] expr [, delimiter] )
[ WITHIN GROUP (ORDER BY { sortKey [ sort_direction ] [ nulls_sort_oder ] } [, ...] ) ]
[ FILTER ( WHERE cond ) ]
sort_direction
[ ASC | DESC ]
nulls_sort_order
[ NULLS FIRST | NULLS LAST ]
This function can also be invoked as a window function using the OVER
clause if the WITHIN GROUP
clause and FILTER
clause are omitted.
Arguments
-
expr
An expression that evaluates to a
STRING
orBINARY
.NULL
values are ignored. -
delimiter
A constant expression matching the type of
expr
used to separate the concatenated values. The default is an empty string (''
) forSTRING
and a zero length binary (''::BINARY
) forBINARY
. -
ORDER BY
An expression used to order the values before concatenation. The default is the order of the rows in the group.
-
sortKey
An expression on which order is defined. The column references within
sortKey
must also be present inexpr
. -
sort_direction
Specifies the sort order for the order by expression.
ASC
: The sort direction for this expression is ascending.DESC
: The sort order for this expression is descending.
If sort direction is not explicitly specified, then by default rows are sorted ascending.
-
nulls_sort_order
Optionally specifies whether NULL values are returned before/after non-NULL values. If
null_sort_order
is not specified, then NULLs sort first if sort order isASC
and NULLS sort last if sort order isDESC
.NULLS FIRST
:NULL
values are returned first regardless of the sort order.NULLS LAST
:NULL
values are returned last regardless of the sort order.
-
-
cond
: An optionalBOOLEAN
expression filtering the rows used for aggregation.
Returns
A STRING
if expr
is a STRING
, BINARY
otherwise.
If DISTINCT
is specified only unique values are aggregated and the sortKey
must match expr
.
Examples
-- Simple example with default delimiter
> SELECT string_agg(col) FROM VALUES ('a'), ('b'), ('c') AS tab(col);
acb
-- Simple example with explicit delimiter
> SELECT string_agg(col, ', ') FROM VALUES ('a'), ('b'), ('c') AS tab(col);
b, a, c
-- Example with nulls
> SELECT string_agg(col) FROM VALUES ('a'), (NULL), ('c') AS tab(col);
ac
-- Example with explicit order
> SELECT string_agg(col) WITHIN GROUP (ORDER BY col DESC)
FROM VALUES ('a'), ('b'), ('c') AS tab(col);
cba
-- Example with DISTINCT
> SELECT string_agg(DISTINCT col)
FROM VALUES ('a'), ('a'), ('b') AS tab(col);
ab
-- Example of FUNCTION_AND_ORDER_EXPRESSION_MISMATCH
> SELECT string_agg(DISTINCT col) WITHIN GROUP (ORDER BY id DESC)
FROM VALUES (1, 'a'), (2, 'b'), (3, 'c') AS tab(id, col);
Error: [INVALID_WITHIN_GROUP_EXPRESSION.MISMATCH_WITH_DISTINCT_INPUT
-- Example with `BINARY`
> SELECT hex(string_agg(col::binary, x'00'))
FROM VALUES (1, 'a'), (2, 'b'), (3, 'c') AS tab(id, col);
6100630062