median aggregate function

Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime 11.2 and above

Returns the median calculated from values of a group.

Syntax

median ( [ALL | DISTINCT] expr ) [FILTER ( WHERE cond ) ]

This function can also be invoked as a window function using the OVER clause.

Arguments

  • expr: An expression that evaluates to a numeric or an interval.

  • cond: An optional boolean expression filtering the rows used for aggregation.

Returns

The following explains how the result types are computed:

  • year-month interval: The result is an INTERVAL YEAR TO MONTH.

  • day-time interval: The result is an INTERVAL DAY TO SECOND.

  • In all other cases the result is a DOUBLE.

Nulls within the group are ignored. If a group is empty or consists only of nulls, the result is NULL.

If DISTINCT is specified, duplicates are removed and the median is computed.

This function is a synonym for percentile_cont(0.5) WITHIN GROUP (ORDER BY expr).

Examples

> SELECT median(col) FROM VALUES (1), (2), (2), (3), (4), (NULL) AS tab(col);
 2.0

> SELECT median(DISTINCT col) FROM VALUES (1), (2), (2), (3), (4), (NULL) AS tab(col);
 2.5

> SELECT median(col) FROM VALUES (INTERVAL '1' YEAR), (INTERVAL '2' YEAR) AS tab(col);
 1-6