theta_union function

Applies to: Databricks SQL; Databricks Runtime 18.0 and above

Merges exactly two Theta Sketch binary representations using set union.

Syntax

theta_union ( first, second [, lgNomEntries ] )

Arguments

  • first: A Theta Sketch in binary format.
  • second: A Theta Sketch in binary format.
  • lgNomEntries: An optional INTEGER literal specifying the log-base-2 of the nominal entries for the union buffer. Must be between 4 and 26, inclusive. The default is 12.

Returns

A BINARY value containing the serialized Theta Sketch representing the union of the two input sketches.

Notes

  • The union operation handles input sketches with different lgNomEntries values.
  • To merge more than two sketches, use the theta_union_agg aggregate function instead.
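
The two notes above can be sketched as follows. This is a minimal illustration, not a confirmed reference: it assumes that theta_sketch_agg accepts an optional lgNomEntries value as its second argument and that theta_union_agg takes a column of serialized sketches, neither of which is stated on this page. The grp column and the sketches alias are hypothetical names used only for the example.

```sql
-- Union two sketches that were built with different lgNomEntries values
-- (assumption: theta_sketch_agg accepts lgNomEntries as a second argument)
> SELECT theta_sketch_estimate(theta_union(theta_sketch_agg(col1, 10), theta_sketch_agg(col2, 14)))
  FROM VALUES (1, 4), (1, 4), (2, 5), (2, 5), (3, 6) tab(col1, col2);

-- Merge more than two sketches: build one sketch per group,
-- then combine all of them with the theta_union_agg aggregate
> SELECT theta_sketch_estimate(theta_union_agg(sketch))
  FROM (SELECT theta_sketch_agg(col1) AS sketch
        FROM VALUES ('a', 1), ('a', 2), ('b', 2), ('b', 3), ('c', 4) tab(grp, col1)
        GROUP BY grp) AS sketches;
```

The second query produces three per-group sketches, which is more than theta_union can take in a single call, so the aggregate form is the natural fit.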

Error messages

Examples

SQL
-- Union two sketches
> SELECT theta_sketch_estimate(theta_union(theta_sketch_agg(col1), theta_sketch_agg(col2)))
FROM VALUES (1, 4), (1, 4), (2, 5), (2, 5), (3, 6) tab(col1, col2);
6
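
The optional third argument can also be supplied explicitly. The following sketch reuses the data from the example above and passes 14 as lgNomEntries, a value chosen arbitrarily from the documented range of 4 to 26:

```sql
-- Union two sketches with an explicit lgNomEntries of 14 for the union buffer
> SELECT theta_sketch_estimate(theta_union(theta_sketch_agg(col1), theta_sketch_agg(col2), 14))
  FROM VALUES (1, 4), (1, 4), (2, 5), (2, 5), (3, 6) tab(col1, col2);
```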