hll_sketch_estimate function
Applies to: Databricks SQL; Databricks Runtime 13.3 LTS and above
Returns an estimate of the number of distinct values collected in a sketch. The function uses the HyperLogLog algorithm to compute a probabilistic approximation of the number of unique values in a given column. It consumes a binary representation, known as a sketch buffer, previously generated by the hll_sketch_agg function, and returns the result as a big integer (BIGINT).
The hll_union and hll_union_agg functions combine sketches by consuming and merging these buffers.
The implementation uses the Apache DataSketches library. See HLL for more information.
Arguments
expr: A BINARY expression holding a sketch generated by hll_sketch_agg.
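Examples
The following is a minimal sketch of how the functions described above fit together, not an authoritative reference. It assumes an inline VALUES table; the names tab, col, col1, col2, and the result aliases are illustrative only. The first query builds a sketch with hll_sketch_agg and estimates its distinct count. The second merges two sketches with hll_union before estimating.

```sql
-- Build a sketch over a column, then estimate its number of distinct values.
-- The inline table and the names tab and col are illustrative.
SELECT hll_sketch_estimate(hll_sketch_agg(col)) AS approx_distinct
  FROM VALUES (1), (1), (2), (2), (3) AS tab(col);

-- Build one sketch per column, merge them with hll_union,
-- then estimate the distinct count of the combined sketch.
SELECT hll_sketch_estimate(
         hll_union(hll_sketch_agg(col1), hll_sketch_agg(col2))
       ) AS approx_distinct_union
  FROM VALUES (1, 4), (1, 4), (2, 5), (2, 5), (3, 6) AS tab(col1, col2);
```

The first input contains 3 distinct values, and the two columns of the second input contain 6 distinct values between them. For inputs this small the estimates are expected to match those exact counts, but in general the result is an approximation rather than an exact count.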