regr_intercept aggregate function (Databricks SQL)

Returns the intercept of the univariate linear regression line fitted to the rows in a group where both xExpr and yExpr are not NULL.

Requires: SQL warehouse version 2022.35 or higher. This version is available in the Preview channel.

Syntax

regr_intercept ( [ALL | DISTINCT] yExpr, xExpr ) [ FILTER ( WHERE cond ) ]

Arguments

  • yExpr: A numeric expression, the dependent variable.

  • xExpr: A numeric expression, the independent variable.

  • cond: An optional boolean expression filtering the rows used for the function.

Returns

A DOUBLE.

Any row in the group where yExpr or xExpr is NULL is ignored. If a group is empty, or if every row in it contains a NULL, the result is NULL.

If DISTINCT is specified, the intercept is computed after duplicates are removed.

This function is equivalent to avg(y) - regr_slope(y, x) * avg(x), where both averages are taken over the rows in which y and x are both not NULL.
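
The relationship between regr_intercept, regr_slope, and avg can be checked with a query. The following is a minimal sketch rather than an authoritative test: the column aliases are illustrative, and the WHERE clause is added so that avg sees only the rows that regr_intercept itself uses (those where both y and x are not NULL). The two result columns should agree up to floating-point rounding.

```sql
-- Sketch: compare regr_intercept with the manual formula.
-- The WHERE clause removes incomplete pairs so that avg(y) and avg(x)
-- are computed over the same rows as the regression functions.
SELECT regr_intercept(y, x)               AS intercept,
       avg(y) - regr_slope(y, x) * avg(x) AS manual_intercept
  FROM VALUES (1, 2), (2, 3), (2, 3), (null, 4), (4, null) AS T(y, x)
 WHERE y IS NOT NULL AND x IS NOT NULL;
```

Without the WHERE clause, avg(y) and avg(x) would each skip only their own NULLs, so the manual expression could differ from regr_intercept.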

Examples

> SELECT regr_intercept(y, x) FROM VALUES (1, 2), (2, 3), (2, 3), (null, 4), (4, null) AS T(y, x);
  -1.0
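
The optional clauses from the syntax can be combined with GROUP BY. The query below is a hedged sketch: the table alias T, the grouping column grp, and the sample rows are made up for illustration, and no output is shown because the exact floating-point formatting has not been verified. In group 'a' the rows with x > 0 lie on y = 2x - 1, and the row with x = 0 is an outlier that the FILTER clause excludes. In group 'b' all rows lie on y = 2x, and one row is duplicated.

```sql
-- Sketch: one intercept per group, using the FILTER and DISTINCT options.
SELECT grp,
       regr_intercept(y, x)                      AS intercept_all,
       regr_intercept(y, x) FILTER (WHERE x > 0) AS intercept_filtered,
       regr_intercept(DISTINCT y, x)             AS intercept_distinct
  FROM VALUES ('a', 5, 0), ('a', 1, 1), ('a', 3, 2), ('a', 5, 3),
              ('b', 2, 1), ('b', 2, 1), ('b', 4, 2), ('b', 6, 3) AS T(grp, y, x)
 GROUP BY grp;
```

The filter keeps at least two distinct x values in each group. A group whose remaining rows all share the same x has no defined slope, so that case is deliberately avoided here.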