kll_sketch_agg_bigint aggregate function

Applies to: Databricks Runtime 18.0 and later

Creates a KLL (Karnin-Lang-Liberty) sketch for approximate quantile estimation on integer data, with configurable accuracy.

Syntax

kll_sketch_agg_bigint ( expr [, k] )

Arguments

  • expr: An integral numeric expression to aggregate.
  • k: An optional INTEGER literal controlling sketch accuracy. Must be between 8 and 65535. The default is 200. Higher values provide better accuracy but use more memory.

Returns

A BINARY value containing the serialized KLL sketch for integer data.
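
The BINARY result is typically passed to other sketch functions rather than inspected directly. As a minimal sketch of that flow, the query below builds a sketch in a subquery and asks it for an approximate median. It assumes that the companion function kll_sketch_get_quantile_bigint(sketch, rank), which this page does not describe, is available in the same runtime and accepts a rank between 0 and 1.

```sql
-- Build the sketch once, then query it.
-- kll_sketch_get_quantile_bigint is an assumed companion function;
-- check its own reference page for the exact signature.
SELECT kll_sketch_get_quantile_bigint(sketch, 0.5) AS approx_median
FROM (
  SELECT kll_sketch_agg_bigint(value) AS sketch
  FROM VALUES (1), (2), (3), (4), (5) AS T(value)
) AS s;
```

Keeping the sketch in its own column means it can be stored in a table and queried for several ranks later without rescanning the source data.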

Notes

  • NULL values in expr are ignored during aggregation.
  • Quantile estimates are approximate; the error bound determined by k holds with a confidence level of about 99%.
  • Sketches are mergeable, allowing distributed aggregation.
  • Memory usage is approximately O(k) items regardless of input size.
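
The NULL handling and mergeability described in the notes above can be illustrated with two short queries. Both are sketches under assumptions: kll_sketch_get_n_bigint(sketch), taken here to return the number of items a sketch has absorbed, and kll_sketch_merge_bigint(sketch1, sketch2), taken here to combine two sketches into one, are companion functions that this page does not describe.

```sql
-- NULL inputs are skipped, so the sketch should report 3 items, not 4.
-- kll_sketch_get_n_bigint is an assumed companion function.
SELECT kll_sketch_get_n_bigint(kll_sketch_agg_bigint(value)) AS item_count
FROM VALUES (1), (NULL), (2), (3) AS T(value);

-- Build two sketches over separate inputs, then merge them.
-- kll_sketch_merge_bigint is an assumed companion function.
SELECT kll_sketch_get_n_bigint(
         kll_sketch_merge_bigint(a.sketch, b.sketch)) AS total_items
FROM (SELECT kll_sketch_agg_bigint(value) AS sketch
      FROM VALUES (1), (2) AS T(value)) AS a
CROSS JOIN
     (SELECT kll_sketch_agg_bigint(value) AS sketch
      FROM VALUES (3), (4), (5) AS T(value)) AS b;
```

If the merge behaves as described, the combined sketch summarizes all five values, which is what makes it possible to build sketches per partition or per day and combine them afterwards.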

Examples

SQL
-- Create sketch with default k=200
> SELECT kll_sketch_agg_bigint(value) FROM VALUES (1), (2), (3), (4), (5) AS T(value)
[binary data]

-- Create sketch with custom k=400 for higher accuracy
> SELECT kll_sketch_agg_bigint(value, 400) FROM VALUES (10), (20), (30) AS T(value)
[binary data]