approxCountDistinct
Aggregate function that returns a new Column containing an estimate of the number of distinct values in a specified column or group of columns. Supports Spark Connect.
Warning: Deprecated in 2.1.0. Use approx_count_distinct instead.
Syntax
Python
from pyspark.databricks.sql import functions as dbf
dbf.approxCountDistinct(col=<col>, rsd=<rsd>)
Parameters
| Parameter | Type | Description |
|---|---|---|
| col | pyspark.sql.Column or str | The column, or the name of the column, to count distinct values in. |
| rsd | float, optional | The maximum allowed relative standard deviation of the estimate (default = 0.05). |
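To make the rsd parameter concrete, the sketch below computes the spread that a given relative standard deviation implies around a known distinct count. The helper name `one_sigma_band` is hypothetical and not part of any Spark API. The band is about one standard deviation wide on each side, so it describes typical error and is not a hard bound on any single estimate.

```python
def one_sigma_band(true_count, rsd=0.05):
    """Return (low, high): true_count plus or minus one standard deviation.

    Hypothetical helper for illustration; rsd is the relative standard
    deviation, so the absolute standard deviation is true_count * rsd.
    """
    spread = true_count * rsd
    return true_count - spread, true_count + spread


# With the default rsd, a column holding 1,000,000 distinct values
# has an estimate whose standard deviation is roughly 50,000.
low, high = one_sigma_band(1_000_000)
print(low, high)
```

A smaller rsd narrows this band; the approximate aggregate generally needs more memory to achieve it.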
Examples
See approx_count_distinct for examples.