mean aggregate function

Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime

Returns the mean calculated from values of a group.

Syntax

mean ( [ALL | DISTINCT] expr ) [FILTER ( WHERE cond ) ]

This function can also be invoked as a window function using the OVER clause.

Arguments

  • expr: An expression that evaluates to a numeric.

  • cond: An optional boolean expression filtering the rows used for aggregation.

Returns

The result type is computed as for the arguments:

  • DECIMAL(p, s): The result type is a` DECIMAL(p + 4, s + 4)`. If the maximum precision for DECIMAL is reached, the increase in scale is limited to avoid loss of significant digits.

  • year-month interval: The result is an INTERVAL YEAR TO MONTH.

  • day-time interval: The result is an INTERVAL DAY TO SECOND.

  • In all other cases the result is a DOUBLE.

Nulls within the group are ignored. If a group is empty or consists only of nulls the result is NULL.

If DISTINCT is specified the mean is computed after duplicates have been removed.

This function is a synonym for avg aggregate function.

Warning

In Databricks Runtime, if spark.sql.ansi.enabled is false, an overflow returns NULL instead of an error.

Examples

> SELECT mean(col) FROM VALUES (1), (2), (3) AS tab(col);
 2.0

> SELECT mean(DISTINCT col) FROM VALUES (1), (1), (2), (NULL) AS tab(col);
 1.5

> SELECT mean(col) FROM VALUES (1), (2), (NULL) AS tab(col);
 1.5