schema_of_variant_agg
aggregate function
July 26, 2024
Applies to: Databricks SQL Databricks Runtime 15.3 and later
Returns the combined schema of all VARIANT
values in a group in DDL format.
Arguments
variantExpr
: AVARIANT
expression.cond
: An optionalBOOLEAN
expression filtering the rows used for aggregation.
Returns
A STRING
holding a schema definition of the variantExpr
.
The types in the schema are the derived formatted SQL types.
The schema of each VARIANT
value is merged together by field name. When two fields with the same name have a different type across records, Databricks uses the least common type.
When no such type exists, the type is derived as a VARIANT
. For example, INT
and DOUBLE
become DOUBLE
, while TIMESTAMP
and STRING
become VARIANT
.
To derive the schema of a single VARIANT
value, use schema_of_variant function.
Examples
-- Simple example
> SELECT schema_of_variant_agg(a) FROM VALUES(parse_json('{"foo": "bar"}')) AS data(a);
OBJECT<foo: STRING>
> SELECT schema_of_variant_agg(a) FROM VALUES(parse_json('[1]')) AS data(a);
ARRAY<BIGINT>
> CREATE TEMPORARY VIEW data(a) AS VALUES
(parse_json('{"foo": "bar", "wing": {"ding": "dong"}}')),
(parse_json('{"wing": 123}'));
> SELECT schema_of_variant_agg(a) FROM data;
OBJECT<foo: STRING, wing: VARIANT>