vector_sum aggregate function

Applies to: Databricks Runtime 18.1 and above

Computes the element-wise sum of vectors in an aggregate. Returns a vector where each element is the sum of the corresponding elements across all input vectors.

Syntax

vector_sum(vectors) [FILTER ( WHERE cond ) ]

Arguments

  • vectors: A column of ARRAY<FLOAT> expressions representing vectors. All vectors must have the same dimension.
  • cond: An optional boolean expression filtering the rows used for aggregation.
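As a minimal sketch of the optional FILTER clause, the query below sums only the vectors in rows that satisfy the condition. The table vector_data and its columns category and embedding are assumed here for illustration (embedding is taken to be ARRAY<FLOAT>).

```sql
-- Assumed schema: vector_data(category STRING, embedding ARRAY<FLOAT>)
-- Only rows where category = 'A' take part in the aggregation.
SELECT vector_sum(embedding) FILTER (WHERE category = 'A') AS sum_a
FROM vector_data;
```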

Returns

An ARRAY<FLOAT> value with the same dimension as the input vectors. Each element in the result is the sum of the corresponding elements across all input vectors.

NULL values and non-NULL vectors containing a NULL element are ignored in the aggregation. Returns NULL if all values in the group are invalid (NULL or non-NULL vectors with NULL elements). Returns an empty array [] if all input vectors are empty.
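The NULL-handling rules above can be illustrated with an inline table. This is a sketch under the stated rules, not verified output: the alias t(v) and the literal values are hypothetical, and each row is cast explicitly so that every input is ARRAY<FLOAT>.

```sql
-- Four input rows: two valid vectors, one NULL vector,
-- and one non-NULL vector that contains a NULL element.
SELECT vector_sum(v) AS total
FROM VALUES
  (CAST(array(1, 2) AS ARRAY<FLOAT>)),
  (CAST(NULL AS ARRAY<FLOAT>)),
  (CAST(array(3, NULL) AS ARRAY<FLOAT>)),
  (CAST(array(10, 20) AS ARRAY<FLOAT>)) AS t(v);
-- Per the rules above, only the first and last rows should
-- contribute to the element-wise sum.
```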

Notes

  • Only ARRAY<FLOAT> is supported; other types such as ARRAY<DOUBLE> or ARRAY<DECIMAL> raise an error.
  • All input vectors must have the same dimension; otherwise the function raises VECTOR_DIMENSION_MISMATCH.
  • A non-NULL vector that contains a NULL element is treated as NULL.
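Because only ARRAY<FLOAT> is accepted, a column stored as ARRAY<DOUBLE> would need an explicit cast before aggregation. The following is a sketch of one possible approach; the table double_vectors and the column embedding_double are hypothetical, and casting DOUBLE to FLOAT can lose precision.

```sql
-- Assumed schema: double_vectors(embedding_double ARRAY<DOUBLE>)
-- Cast each vector to ARRAY<FLOAT> so vector_sum accepts it.
SELECT vector_sum(CAST(embedding_double AS ARRAY<FLOAT>)) AS sum_vector
FROM double_vectors;
```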

Error conditions

  • VECTOR_DIMENSION_MISMATCH: raised when the input vectors do not all have the same dimension.

Examples

SQL
-- Element-wise sum per category (with GROUP BY)
> SELECT category, vector_sum(embedding) AS sum_vector
FROM vector_data
GROUP BY category
ORDER BY category;
category: A, sum_vector: [5.0, 7.0, 9.0]
category: B, sum_vector: [5.0, 3.0, 5.0]

-- Scalar aggregation (without GROUP BY)
> SELECT vector_sum(embedding) AS total_sum FROM vector_data;
total_sum: [10.0, 10.0, 14.0]
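
The examples above query a table named vector_data without showing its contents. The sketch below defines one possible data set that is arithmetically consistent with the outputs shown (A: [5, 7, 9], B: [5, 3, 5], total: [10, 10, 14]). The rows are hypothetical and are not necessarily the data used to produce those results.

```sql
-- Hypothetical sample data for the examples above.
-- Category A: [1,2,3] + [4,5,6] = [5,7,9]
-- Category B: [2,1,2] + [3,2,3] = [5,3,5]
CREATE OR REPLACE TEMPORARY VIEW vector_data AS
SELECT * FROM VALUES
  ('A', CAST(array(1, 2, 3) AS ARRAY<FLOAT>)),
  ('A', CAST(array(4, 5, 6) AS ARRAY<FLOAT>)),
  ('B', CAST(array(2, 1, 2) AS ARRAY<FLOAT>)),
  ('B', CAST(array(3, 2, 3) AS ARRAY<FLOAT>))
  AS t(category, embedding);
```

The explicit casts matter here because integer array literals would otherwise be typed as ARRAY<INT>, which the function does not accept.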