vector_avg

Aggregate function: returns the element-wise mean of float vectors in a group. All vectors must have the same dimension.

For the corresponding Databricks SQL function, see vector_avg aggregate function.
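To make "element-wise mean" concrete, the following plain-Python sketch averages a group of vectors position by position. It is only an illustration of the arithmetic, not the Databricks implementation: `elementwise_mean` is a hypothetical helper, and raising `ValueError` on an empty group or on mismatched dimensions is an assumption of this sketch, not documented behavior of `vector_avg`.

```python
from typing import List, Sequence


def elementwise_mean(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical helper: average same-length vectors position by position."""
    if not vectors:
        # Assumption of this sketch: an empty group is treated as an error.
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        # Assumption of this sketch: mismatched dimensions are rejected.
        raise ValueError("all vectors must have the same dimension")
    n = len(vectors)
    # zip(*vectors) groups the i-th elements of every vector together,
    # so each output element is the mean of one position across the group.
    return [sum(component) / n for component in zip(*vectors)]


group = [[1.0, 2.0], [3.0, 4.0]]
result = elementwise_mean(group)
print(result)  # [2.0, 3.0]
```

The first output element is (1.0 + 3.0) / 2 and the second is (2.0 + 4.0) / 2, which is the same result the `vector_avg` example in the Examples section reports for this input.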

Syntax

Python
from pyspark.sql import functions as dbf

dbf.vector_avg(col=<col>)

Parameters

Parameter | Type | Description
col | pyspark.sql.Column or column name | Input vector column.

Returns

pyspark.sql.Column: The element-wise average vector as an array of floats.

Examples

Python
from pyspark.sql import functions as dbf
from pyspark.sql.types import ArrayType, FloatType, StructType, StructField

schema = StructType([StructField('v', ArrayType(FloatType()))])
df = spark.createDataFrame([([1.0, 2.0],), ([3.0, 4.0],)], schema)
df.select(dbf.vector_avg('v')).first()[0]
# [2.0, 3.0]