covar_pop

Returns a new Column for the population covariance of col1 and col2.

Syntax

Python
from pyspark.sql import functions as sf

sf.covar_pop(col1, col2)

Parameters

Parameter   Type                                Description
col1        pyspark.sql.Column or column name   First column to calculate covariance.
col2        pyspark.sql.Column or column name   Second column to calculate covariance.

Returns

pyspark.sql.Column: the population covariance of the values in col1 and col2.

Examples

Python
from pyspark.sql import functions as sf
a = [1] * 10
b = [1] * 10
df = spark.createDataFrame(zip(a, b), ["a", "b"])
df.agg(sf.covar_pop("a", df.b)).show()
Output
+---------------+
|covar_pop(a, b)|
+---------------+
|            0.0|
+---------------+
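
The example above uses two constant columns, so the result is necessarily 0.0. As a rough illustration of the quantity involved, the following plain-Python sketch computes a population covariance by hand, using the standard definition (the mean of the products of deviations, dividing by N rather than N - 1). The helper name population_covariance and the sample lists are hypothetical and are not part of PySpark; the sketch does not reproduce how Spark evaluates the aggregate internally.

```python
def population_covariance(xs, ys):
    """Population covariance: sum((x - mean_x) * (y - mean_y)) / N.

    Hypothetical helper for illustration only; not a PySpark function.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Divide by N (population), not N - 1 (sample).
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Illustrative data: b is exactly twice a, so the two move together.
a = [1, 2, 3, 4]
b = [2, 4, 6, 8]
result = population_covariance(a, b)
print(result)  # 2.5

# Constant inputs, as in the example above, have no variation.
print(population_covariance([1] * 10, [1] * 10))  # 0.0
```

Assuming a DataFrame built from the same two lists, df.agg(sf.covar_pop("a", "b")) should report the same value, up to floating-point rounding.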