regr_avgy aggregate function (Databricks SQL)

Returns the mean of yExpr over the rows of a group in which both xExpr and yExpr are NOT NULL.

Syntax

regr_avgy( [ALL | DISTINCT] yExpr, xExpr) [FILTER ( WHERE cond ) ]

Arguments

  • yExpr: A numeric expression, the dependent variable.

  • xExpr: A numeric expression, the independent variable.

  • cond: An optional boolean expression filtering the rows used for the function.
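As a small sketch of the FILTER clause, the query below reuses the sample rows from the Examples section and keeps only rows where x > 2. The predicate is an arbitrary choice for illustration. The rows (2, 3), (2, 3), and (null, 4) pass the filter, and the last of these is then ignored because y is NULL.

```sql
-- Illustrative predicate (x > 2); not part of the original examples.
-- (4, null) fails the filter because NULL > 2 is not true;
-- (null, 4) passes the filter but is skipped because y is NULL.
SELECT regr_avgy(y, x) FILTER (WHERE x > 2)
  FROM VALUES (1, 2), (2, 3), (2, 3), (null, 4), (4, null) AS T(y, x);
-- 2.0
```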

Returns

The result type depends on the type of yExpr:

  • DECIMAL(p, s): The result type is DECIMAL(p + 4, s + 4). If the maximum precision for DECIMAL is reached, the increase in scale is limited to avoid loss of significant digits.

  • Otherwise the result is a DOUBLE.
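One way to check the result type is to wrap the call in typeof. The sketch below is an illustration only: the DECIMAL(10, 2) cast and the column aliases are assumptions chosen for the example, and the expected types in the comment are derived from the rule above rather than from a captured run.

```sql
-- The cast to DECIMAL(10, 2) is an illustrative assumption.
SELECT typeof(regr_avgy(CAST(y AS DECIMAL(10, 2)), x)) AS decimal_input,
       typeof(regr_avgy(y, x)) AS integer_input
  FROM VALUES (1, 2), (2, 3) AS T(y, x);
-- Expected per the rule above: decimal(14,6) for decimal_input, double for integer_input
```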

Rows in which xExpr or yExpr is NULL are ignored. If a group is empty or contains no row in which both expressions are non-null, the result is NULL.
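A minimal sketch of the NULL case, using made-up rows in which every pair has a NULL on one side, so no value contributes to the mean:

```sql
-- Neither row has both y and x non-null.
SELECT regr_avgy(y, x) FROM VALUES (null, 1), (2, null) AS T(y, x);
-- NULL
```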

If DISTINCT is specified, the average is computed after duplicates have been removed.
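To illustrate DISTINCT, the sketch below applies it to the sample rows from the Examples section. The qualifying y values are 1, 2, and 2; with the duplicate removed, the mean should be (1 + 2) / 2 = 1.5 instead of 1.6666666666666667. The expected value is worked out by hand from the description above, not taken from a captured run.

```sql
-- Qualifying y values: 1, 2, 2. After duplicate removal: 1, 2.
SELECT regr_avgy(DISTINCT y, x)
  FROM VALUES (1, 2), (2, 3), (2, 3), (null, 4), (4, null) AS T(y, x);
-- Expected: 1.5
```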

regr_avgy(y, x) is a synonym for avg(y) FILTER(WHERE x IS NOT NULL AND y IS NOT NULL).
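The equivalence can be checked side by side. This sketch runs both forms over the sample rows from the Examples section; the column aliases are illustrative.

```sql
-- Both columns average the y values 1, 2, 2 from rows where x and y are non-null.
SELECT regr_avgy(y, x) AS regr_result,
       avg(y) FILTER (WHERE x IS NOT NULL AND y IS NOT NULL) AS filtered_avg
  FROM VALUES (1, 2), (2, 3), (2, 3), (null, 4), (4, null) AS T(y, x);
-- 1.6666666666666667  1.6666666666666667
```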

Examples

> SELECT regr_avgy(y, x) FROM VALUES (1, 2), (2, 3), (2, 3), (null, 4), (4, null) AS T(y, x);
  1.6666666666666667