vector_normalize function

Applies to: Databricks Runtime 18.1 and above

Normalizes a vector to unit length using the specified norm degree.

Syntax

vector_normalize(vector [, degree ])

Arguments

  • vector: An ARRAY<FLOAT> expression representing the vector.
  • degree: Optional. A FLOAT value specifying the norm type; defaults to 2.0 (Euclidean norm). Supported values:
    • 1.0 — L1 norm: absolute values of components sum to 1
    • 2.0 — L2 norm: Euclidean length equals 1
    • float('inf') — L∞ norm: maximum absolute component equals 1
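
To make the three supported degrees concrete, the following Python sketch computes the norm that each one refers to. It is an illustrative model only, not the Databricks implementation: the helper name `norm` is hypothetical, and Python floats are 64-bit whereas the SQL function operates on 32-bit FLOAT values.

```python
import math

def norm(vector, degree=2.0):
    """Hypothetical helper: the norm selected by each supported degree."""
    if degree == 1.0:
        # L1: sum of the absolute values of the components
        return sum(abs(x) for x in vector)
    if degree == 2.0:
        # L2: Euclidean length
        return math.sqrt(sum(x * x for x in vector))
    if degree == math.inf:
        # L-infinity: largest absolute component
        return max(abs(x) for x in vector)
    raise ValueError(f"unsupported degree: {degree}")

v = [3.0, -4.0]
l1, l2, linf = norm(v, 1.0), norm(v, 2.0), norm(v, math.inf)
```

Dividing every component of the vector by the selected norm yields a vector whose norm under that same degree is 1.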

Returns

An ARRAY<FLOAT> representing the normalized vector with the same direction as the input but with norm 1.0 under the specified degree.

Returns an empty array if the input vector is empty. Returns NULL if the vector has zero norm (for example, all components are zero), if the input is NULL, or if any element of the input is NULL.
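
The return rules above can be sketched as a small Python model. This is a hedged illustration under stated assumptions, not the actual implementation: `vector_normalize_model` is a hypothetical name, `None` stands in for SQL NULL, a `ValueError` stands in for the SQL error raised on an unsupported degree, and the order in which the real function checks these conditions is assumed.

```python
import math

def vector_normalize_model(vector, degree=2.0):
    """Illustrative model of the documented return rules."""
    # NULL input or any NULL element -> NULL
    if vector is None or any(x is None for x in vector):
        return None
    # Empty vector -> empty array (assumed to be checked before the degree)
    if not vector:
        return []
    if degree == 1.0:
        n = sum(abs(x) for x in vector)
    elif degree == 2.0:
        n = math.sqrt(sum(x * x for x in vector))
    elif degree == math.inf:
        n = max(abs(x) for x in vector)
    else:
        # Stand-in for the error raised on an unsupported degree
        raise ValueError(f"unsupported degree: {degree}")
    # Zero norm (for example, an all-zero vector) -> NULL
    if n == 0:
        return None
    return [x / n for x in vector]

unit = vector_normalize_model([3.0, 4.0])        # L2 by default
zero = vector_normalize_model([0.0, 0.0, 0.0])   # zero norm
```

Returning NULL for a zero-norm vector avoids a division by zero, since such a vector has no direction to preserve.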

Notes

  • Only ARRAY<FLOAT> is supported; other types such as ARRAY<DOUBLE> or ARRAY<DECIMAL> raise an error.
  • An unsupported degree value raises INVALID_VECTOR_NORM_DEGREE.
  • L2 normalization is the standard choice for dense embedding workloads.
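
One common reason L2 normalization is used with embeddings is that the dot product of two L2-normalized vectors equals their cosine similarity, so similarity search can use a plain dot product. The Python sketch below checks this numerically; the helper names are hypothetical and the code is a model of the math, not of any Databricks API.

```python
import math

def l2_normalize(v):
    """Scale v to unit Euclidean length (assumes v has non-zero norm)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a, b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
direct = cosine_similarity(a, b)                  # cosine on the raw vectors
via_unit = dot(l2_normalize(a), l2_normalize(b))  # dot product on unit vectors
```

The two values agree up to floating-point rounding.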

Error conditions

  • INVALID_VECTOR_NORM_DEGREE: raised when degree is not one of the supported values.

Examples

SQL
-- L2 normalization (Euclidean) - 3-4-5 triangle
> SELECT vector_normalize(array(3.0f, 4.0f), 2.0f);
[0.6, 0.8]

-- Verify L2-normalized vector has unit norm
> SELECT vector_norm(vector_normalize(array(1.0f, 2.0f, 3.0f), 2.0f), 2.0f);
1.0

-- Zero vector returns NULL
> SELECT vector_normalize(array(0.0f, 0.0f, 0.0f), 2.0f);
NULL