to_avro function

Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime 15.4 and later

Returns a Avro binary value with the specified input value.

Syntax

to_avro(expr [, avroSchemaSpec] )

Arguments

  • expr: An expression.

  • avroSchemaSpec: An optional target schema in JSON format. If specified it must match the expr type as specified in Notes.

Returns

An Avro encoded BINARY.

Notes

The mapping of SQL types to Avro types is as follows:

SQL type

Avro schema

VOID

{ "type" : "null" }

BOOLEAN

{ "type" : "boolean" }

TINYINT

{ "type" : "int" }

SMALLINT

{ "type" : "int" }

INT

{ "type" : "int" }

BIGINT

{ "type" : "long" }

DECIMAL(p, s)

{ "type": "fixed", "name": "a", logicalType": "decimal", "size": (p+1)/2, precision": p, "scale": s }

FLOAT

{ "type" : "float" }

DOUBLE

{ "type" : "float" }

STRING

{ "type" : "string" }

DATE

{ "type" : "int", "logicalType" : "date" }

TIMESTAMP

{ "type" : "long" } as microseconds since 1970-01-01 00:00:00.000000

TIMESTAMP_NTZ

{ "type" : "long" } as microseconds since 1970-01-01 00:00:00.000000

YEAR MONTH INTERVAL

{ "type" : "long" } as months

DAY TIME INTERVAL

{ "type" : "long" } as microseconds

BINARY

{ "type" : "bytes" }

STRUCT<field1 type1, ...>

{ "type" : "record", "name": "struct_name", "fields": [ { "name" : "field1", "type" : ... }, ... ] }

ARRAY<type>

{ "type" : "array", "items": { "type" : ... }

MAP<STRING, valueType>

{ "type" : "map", "keyType": { "type" : ... }, "valueType": { "type" : ... } }

MAP<nonStringType, valueType>

Not supported

VARIANT

Not supported

Examples

> SELECT from_avro(to_avro(5), '{ "type" : "int" }');
  5

> SELECT from_avro(to_avro(5, '{ "type" : "int" }'), '{ "type" : "int" }');
  5

> SELECT from_avro(to_avro(named_struct('num', 5, 'txt', 'hello')), '{ "type" : "record", "name": "my_record", "fields": [{ "name": "num", "type": "int"}, { "name": "txt", "type": "string"}]}');
  {"num":5,"txt":"hello"}