kll_sketch_get_rank_bigint function
Applies to: Databricks Runtime 18.0 and later
Estimates the normalized rank (0.0 to 1.0) of a given value in an integer KLL sketch.
Syntax
kll_sketch_get_rank_bigint ( sketch, value )
Arguments
- sketch: A BINARY expression containing a serialized integer KLL sketch.
- value: A BIGINT expression or ARRAY<BIGINT> of values to find ranks for.
Returns
- If value is BIGINT: returns a DOUBLE between 0.0 and 1.0 representing the normalized rank.
- If value is ARRAY<BIGINT>: returns ARRAY<DOUBLE> with one rank for each input value.
Notes
- The rank represents the fraction of values in the sketch that are less than or equal to the given value.
- Returns 0.0 if all sketch values are greater than the input value.
- Returns 1.0 if all sketch values are less than or equal to the input value.
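To make the definition above concrete, here is a minimal sketch that computes the exact normalized rank with plain aggregation and no sketch. The alias T, the column name value, and the output name exact_rank are illustrative. On an input this small the sketch-based estimate is expected to agree with the exact figure; on large inputs a KLL sketch returns an approximation.

```sql
-- Exact normalized rank of 3 among the values 1..5:
-- the fraction of rows whose value is less than or equal to 3.
SELECT count_if(value <= 3) / count(*) AS exact_rank
FROM VALUES (1), (2), (3), (4), (5) AS T(value);
```

With three of the five values at or below 3, this query yields 0.6, the same figure shown for kll_sketch_get_rank_bigint(sketch, 3) in the first example under Examples.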
Examples
SQL
-- Find what percentile a value falls into
> WITH sketch_data AS (
SELECT kll_sketch_agg_bigint(value) AS sketch
FROM VALUES (1), (2), (3), (4), (5) AS T(value)
)
SELECT kll_sketch_get_rank_bigint(sketch, 3) FROM sketch_data
0.6
-- Check ranks for multiple values
> WITH sketch_data AS (
SELECT kll_sketch_agg_bigint(value) AS sketch
FROM VALUES (10), (20), (30), (40), (50) AS T(value)
)
SELECT kll_sketch_get_rank_bigint(sketch, array(15, 25, 35)) FROM sketch_data
[0.2, 0.4, 0.6]
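-- SLA compliance: what fraction of requests finish at or under the threshold
-- (assumes a requests table with a response_time_ms column; the output shown is illustrative)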
> WITH sketch_data AS (
SELECT kll_sketch_agg_bigint(response_time_ms) AS sketch FROM requests
)
SELECT kll_sketch_get_rank_bigint(sketch, 100) AS under_100ms_fraction FROM sketch_data
0.87