percentile_disc aggregate function

Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime 11.0 and above

Returns the value that corresponds to the percentile of the provided sortKey using a discrete distribution model.

Syntax

percentile_disc ( percentile )
       WITHIN GROUP (ORDER BY sortKey [ASC | DESC] )

This function can also be invoked as a window function using the OVER clause.

Arguments

  • percentile: A numeric literal between 0 and 1 or a literal array of numeric literals, each between 0 and 1.

  • sortKey: A numeric expression over which the percentile is computed.

  • ASC or DESC: Optionally specify whether the percentile is computed using ascending or descending order. The default is ASC.

Returns

DOUBLE if percentile is numeric, or an ARRAY of DOUBLE if percentile is an ARRAY.

The aggregate function returns the sortKey value that matches the percentile within the group of sortKeys.

Examples

-- Return the median, 40%-ile and 10%-ile.
> SELECT percentile_disc(array(0.5, 0.4, 0.1)) WITHIN GROUP (ORDER BY col)
    FROM VALUES (0), (1), (2), (10) AS tab(col);
 [1, 1, 0]

-- Return the interpolated median.
> SELECT percentile_disc(0.50) WITHIN GROUP (ORDER BY col)
    FROM VALUES (0), (6), (6), (7), (9), (10) AS tab(col);
 6