vector_normalize function
Applies to: Databricks Runtime 18.1 and above
Normalizes a vector to unit length using the specified norm degree.
Syntax
vector_normalize(vector [, degree ])
Arguments
- vector: An ARRAY<FLOAT> expression representing the vector.
- degree: Optional. A FLOAT value specifying the norm type; defaults to 2.0 (Euclidean norm). Supported values:
  - 1.0: L1 norm, where the absolute values of the components sum to 1
  - 2.0: L2 norm, where the Euclidean length equals 1
  - float('inf'): L∞ norm, where the maximum absolute component equals 1
Returns
An ARRAY<FLOAT> representing the normalized vector with the same direction as the input but with norm 1.0 under the specified degree.
Returns an empty array for empty vectors. Returns NULL if the vector has zero norm (e.g. all zeros) or if the input is NULL or contains NULL.
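The return rules above can be modeled in a few lines of plain Python. This is only a sketch of the documented semantics, not the Databricks implementation: the name `vector_normalize_sketch` is hypothetical, `None` stands in for SQL NULL, the order in which the special cases are checked is an assumption, and Python uses 64-bit floats where the SQL function returns 32-bit FLOAT values, so low-order digits may differ.

```python
import math
from typing import List, Optional, Sequence


def vector_normalize_sketch(
    vector: Optional[Sequence[Optional[float]]],
    degree: float = 2.0,
) -> Optional[List[float]]:
    """Hypothetical model of the documented behavior (None stands in for NULL)."""
    # NULL input, or any NULL element, yields NULL.
    if vector is None or any(x is None for x in vector):
        return None
    # An empty vector yields an empty array.
    if len(vector) == 0:
        return []
    if degree == 1.0:
        norm = sum(abs(x) for x in vector)            # L1: sum of absolute values
    elif degree == 2.0:
        norm = math.sqrt(sum(x * x for x in vector))  # L2: Euclidean length
    elif degree == math.inf:
        norm = max(abs(x) for x in vector)            # L-infinity: largest absolute component
    else:
        # Stand-in for the INVALID_VECTOR_NORM_DEGREE error condition.
        raise ValueError("INVALID_VECTOR_NORM_DEGREE")
    # A zero-norm vector (for example, all zeros) yields NULL.
    if norm == 0.0:
        return None
    return [x / norm for x in vector]


print(vector_normalize_sketch([3.0, 4.0]))             # [0.6, 0.8]
print(vector_normalize_sketch([1.0, -3.0], 1.0))       # [0.25, -0.75]
print(vector_normalize_sketch([2.0, -4.0], math.inf))  # [0.5, -1.0]
print(vector_normalize_sketch([0.0, 0.0, 0.0]))        # None
```

Dividing every component by the same positive norm keeps the direction of the input and changes only its length, which is why the result has norm 1.0 under the chosen degree.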
Notes
- Only ARRAY<FLOAT> is supported; other types such as ARRAY<DOUBLE> or ARRAY<DECIMAL> raise an error. An unsupported degree value raises INVALID_VECTOR_NORM_DEGREE.
- L2 normalization is the standard in dense embedding workloads.
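One common reason L2 normalization is favored for embeddings is that the dot product of two L2-normalized vectors equals their cosine similarity, so similarity search can use a plain dot product. The plain-Python sketch below illustrates that identity; the helper names `l2_normalize`, `dot`, and `cosine_similarity` are hypothetical and not part of any Databricks API, and the inputs are assumed to be non-zero vectors.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale v to Euclidean length 1 (assumes v is not the zero vector)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b, computed from the raw vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]
b = [4.0, 3.0]

# Both values are approximately 0.96 (24 / 25), up to floating-point rounding.
print(cosine_similarity(a, b))
print(dot(l2_normalize(a), l2_normalize(b)))
```

Normalizing once at write time moves the division out of every comparison, which matters when one query vector is compared against many stored vectors.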
Error conditions
- INVALID_VECTOR_NORM_DEGREE
Examples
SQL
-- L2 normalization (Euclidean) - 3-4-5 triangle
> SELECT vector_normalize(array(3.0f, 4.0f), 2.0f);
[0.6, 0.8]
-- Verify L2-normalized vector has unit norm
> SELECT vector_norm(vector_normalize(array(1.0f, 2.0f, 3.0f), 2.0f), 2.0f);
1.0
-- Zero vector returns NULL
> SELECT vector_normalize(array(0.0f, 0.0f, 0.0f), 2.0f);
NULL