cast function

Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime

Casts the value expr to the target data type type. This operator is a synonym for :: (colon colon sign) operator

Syntax

cast(sourceExpr AS targetType)

Arguments

  • sourceExpr: Any castable expression.

  • targetType: The data type of the result.

Returns

The result is type targetType.

The following combinations of data type casting are valid:

Source (row) Target(column)

VOID

numeric

STRING

DATE

TIMESTAMP

TIMESTAMP_NTZ

year-month interval

day-time interval

BOOLEAN

BINARY

ARRAY

MAP

STRUCT

VOID

Y

Y

Y

Y

Y

Y

Y

Y

Y

Y

Y

Y

Y

numeric

N

Y

Y

N

Y

N

Y

Y

Y

N

N

N

N

STRING

N

Y

Y

Y

Y

Y

Y

Y

Y

Y

N

N

N

DATE

N

N

Y

Y

Y

Y

N

N

N

N

N

N

N

TIMESTAMP

N

Y

Y

Y

Y

Y

N

N

N

N

N

N

N

TIMESTAMP_NTZ

N

N

Y

Y

Y

Y

N

N

N

N

N

N

N

year-month interval

N

Y

Y

N

N

N

Y

N

N

N

N

N

N

day-time interval

N

Y

Y

N

N

N

N

Y

N

N

N

N

N

BOOLEAN

N

Y

Y

N

Y

N

N

N

Y

N

N

N

N

BINARY

N

Y

Y

N

N

N

N

N

N

Y

N

N

N

ARRAY

N

N

Y

N

N

N

N

N

N

N

Y

N

N

MAP

N

N

Y

N

N

N

N

N

N

N

N

Y

N

STRUCT

N

N

Y

N

N

N

N

N

N

N

N

N

Y

Rules and limitations based on targetType

Warning

In Databricks Runtime, if spark.sql.ansi.enabled is false, an overflow will not cause an error but instead will “wrap” the result.

A sourceExpr value with an invalid format or invalid characters for targetType will result in a NULL.

numeric

If the targetType is a numeric and sourceExpr is of type:

  • VOID

    The result is a NULL of the specified numeric type.

  • numeric

    If targetType is an integral numeric, the result is sourceExpr truncated to a whole number.

    Otherwise, the result is sourceExpr rounded to a fit the available scale of targetType.

    If the value is outside the range of targetType, an overflow error is raised.

    Use try_cast to turn overflow errors into NULL.

  • STRING

    sourceExpr is read as a literal value of the targetType.

    If sourceExpr doesn’t comply with the format for literal values, an error is raised.

    If the value is outside the range of the targetType, an overflow error is raised.

    Use try_cast to turn overflow and invalid format errors into NULL.

  • TIMESTAMP

    The result is the number of seconds elapsed between 1970-01-01 00:00:00 UTC and sourceExpr.

    If targetType is an integral numeric, the result is truncated to a whole number.

    Otherwise, the result is rounded to a fit the available scale of targetType.

    If the result is outside the range of targetType, an overflow error is raised.

    Use try_cast to turn overflow errors into NULL.

  • INTERVAL

    Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime 11.2 and above

    Target type must be an exact numeric.

    Given an INTERVAL upper_unit TO lower_unit the result is measured in total number of lower_unit. If the lower_unit is SECOND, fractional seconds are stored to the right of the decimal point. For all other intervals the result is always an integral number.

  • BOOLEAN

    If sourceExpr is:

    • true: The result is 0.

    • false: The result is 1.

    • NULL: The result is NULL.

Examples

> SELECT cast(NULL AS INT);
  NULL

> SELECT cast(5.6 AS INT);
  5

> SELECT cast(5.6 AS DECIMAL(2, 0));
  6

> SELECT cast(-5.6 AS INT);
  -5

> SELECT cast(-5.6 AS DECIMAL(2, 0));
  -6

> SELECT cast(128 AS TINYINT);
  Overflow

> SELECT cast(128 AS DECIMAL(2, 0));
  Overflow

> SELECT cast('123' AS INT);
  123

> SELECT cast('123.0' AS INT);
  Invalid format

> SELECT cast(TIMESTAMP'1970-01-01 00:00:01' AS LONG);
  1

> SELECT cast(TIMESTAMP'1970-01-01 00:00:00.000001' AS DOUBLE);
  1.0E-6

> SELECT cast(TIMESTAMP'2022-02-01 00:00:00' AS SMALLINT);
  error: overflow
> SELECT cast(true AS BOOLEAN);
  1

> SELECT cast(INTERVAL '1-2' YEAR TO MONTH AS INTEGER);
  14

> SELECT cast(INTERVAL '1:30.5' MINUTE TO SECOND AS DECIMAL(5, 2));
  90.50

> SELECT cast(TRUE AS INT);
  1

> SELECT cast(FALSE AS INT);
  0

STRING

If the targetType is a STRING type and sourceExpr is of type:

  • VOID

    The result is a NULL string.

  • exact numeric

    The result is the literal number with an optional minus-sign and no leading zeros except for the single digit to the left of the decimal point. If the targetType is DECIMAL(p, s) with s greater 0, a decimal point is added and trailing zeros are added up to scale.

  • floating-point binary

    If the absolute number is less that 10,000,000 and greater or equal than 0.001, the result is expressed without scientific notation with at least one digit on either side of the decimal point.

    Otherwise, Databricks uses a mantissa followed by E and an exponent. The mantissa has an optional leading minus sign followed by one digit to the left of the decimal point, and the minimal number of digits greater than zero to the right. The exponent has and optional leading minus sign.

  • DATE

    If the year is between 9999 BCE and 9999 CE, the result is a dateString of the form -YYYY-MM-DD and YYYY-MM-DD respectively.

    For years prior or after this range, the necessary number of digits are added to the year component and + is used for CE.

  • TIMESTAMP

    If the year is between 9999 BCE and 9999 CE, the result is a timestampString of the form -YYYY-MM-DD hh:mm:ss and YYYY-MM-DD hh:mm:ss respectively.

    For years prior or after this range, the necessary number of digits are added to the year component and + is used for CE.

    Fractional seconds .f... are added if necessary.

  • TIMESTAMP_NTZ

    If the year is between 9999 BCE and 9999 CE, the result is a timestampString of the form -YYYY-MM-DD hh:mm:ss and YYYY-MM-DD hh:mm:ss respectively.

    For years prior or after this range, the necessary number of digits are added to the year component and + is used for CE.

    Fractional seconds .f... are added if necessary.

  • year-month interval

    The result is its shortest representation of the interval literal. If the interval is negative, the sign is embedded in the interval-string. For units smaller than 10,leading zeros are omitted.

    A typical year-month interval string has the form:

    • INTERVAL 'Y' YEAR

    • INTERVAL 'Y-M' YEAR TO MONTH

    • INTERVAL 'M' MONTH

  • day-time interval

    The result is its shortest representation of the interval literal. If the interval is negative, the sign is embedded in the interval-string. For units smaller than 10, leading zeros are omitted.

    A typical day time interval string has the form:

    • INTERVAL 'D' DAY

    • INTERVAL 'D h' DAY TO HOUR

    • INTERVAL 'D h:m' DAY TO MINUTE

    • INTERVAL 'D h:m:s' DAY TO SECOND

    • INTERVAL 'h' HOUR

    • INTERVAL 'h:m' HOUR TO MINUTE

    • INTERVAL 'm:s' MINUTE TO SECOND

    • INTERVAL 's' SECOND

  • BOOLEAN

    The result of the true boolean is the STRING literal true, for false it’s the STRING literal false, and for NULL it’s the NULL string.

  • BINARY

    A result is the binary sourceExpr interpreted as a UTF-8 character sequence.

    Databricks doesn’t validate the UTF-8 characters. A cast from BINARY to STRING will never inject substitution characters or raise an error.

  • ARRAY

    The result is a comma separated list of cast elements, which is braced with square brackets [ ]. One space follows each comma. A NULL element is translated to a literal null.

    Databricks doesn’t quote or otherwise mark individual elements, which may themselves contain brackets or commas.

  • MAP

    The result is a comma separated list of cast key value pairs, which is braced with curly braces { }. One space follows each comma. Each key value pair is separated by a -> . A NULL map value is translated to literal null.

    Databricks doesn’t quote or otherwise mark individual keys or values, which may themselves may contain curly braces, commas or ->.

  • STRUCT

    The result is a comma separated list of cast field values, which is braced with curly braces { }. One space follows each comma. A NULL field value is translated to a literal null.

    Databricks doesn’t quote or otherwise mark individual field values, which may themselves may contain curly braces, or commas.

Examples

> SELECT cast(NULL AS STRING);
  NULL

> SELECT cast(-3Y AS STRING);
  -3

> SELECT cast(5::DECIMAL(10, 5) AS STRING);
  5.00000

> SELECT cast(12345678e-4 AS STRING);
  1234.5678

> SELECT cast(1e7 as string);
  1.0E7

> SELECT cast(1e6 as string);
  1000000.0

> SELECT cast(1e-4 as string);
  1.0E-4

> SELECT cast(1e-3 as string);
  0.001

> SELECT cast(12345678e7 AS STRING);
  1.2345678E14

> SELECT cast(DATE'1900-12-31' AS STRING);
  1900-12-31

-- Caesar no more
> SELECT cast(DATE'-0044-03-15' AS STRING);
  -0044-03-15

> SELECT cast(DATE'100000-12-31' AS STRING);
  +100000-12-31

> SELECT cast(current_timestamp() AS STRING);
  2022-04-02 22:29:09.783

> SELECT cast(TIMESTAMP_NTZ'2023-01-01' AS STRING);
  2023-01-01 00:00:00

> SELECT cast(INTERVAL -'13-02' YEAR TO MONTH AS STRING);
  INTERVAL '-13-2' YEAR TO MONTH

> SELECT cast(INTERVAL '12:04.9900' MINUTE TO SECOND AS STRING);
  INTERVAL '12:04.99' MINUTE TO SECOND

> SELECT cast(true AS STRING);
  true

> SELECT cast(false AS STRING);
  false

-- A bad UTF-8 string
> SELECT cast(x'33800033' AS STRING);
  33

> SELECT hex(cast(x'33800033' AS STRING));
  33800033

> SELECT cast(array('hello', NULL, 'world') AS STRING);
  [hello, null, world]

> SELECT cast(array('hello', 'wor, ld') AS STRING);
  [hello, wor, ld]

> SELECT cast(array() AS STRING);
  []

> SELECT cast(map('hello', 1, 'world', null) AS STRING);
  {hello -> 1, world -> null}

> SELECT cast(map('hello -> 1', DATE'2022-01-01') AS STRING);
  {hello -> 1 -> 2022-01-01}

> SELECT cast(map() AS STRING);
  {}

> SELECT cast(named_struct('a', 5, 'b', 6, 'c', NULL) AS STRING);
  {5, 6, null}

> SELECT cast(named_struct() AS STRING);
  {}

DATE

If the targetType is a DATE type and sourceExpr is of type:

  • VOID

    The result is a NULL DATE.

  • STRING

    sourceExpr must be a valid dateString.

    If sourceExpr is not a valid dateString, Databricks returns an error.

    Use try_cast to turn invalid data errors into NULL.

  • TIMESTAMP

    The result is date portion of the timestamp sourceExpr.

  • TIMESTAMP_NTZ

    The result is date portion of the timestamp_ntz sourceExpr.

Examples

> SELECT cast(NULL AS DATE);
  NULL

> SELECT cast('1900-10-01' AS DATE);
  1900-10-01

> SELECT cast('1900-10-01' AS DATE);
  1900-10-01

-- There is no February 30.
> SELECT cast('1900-02-30' AS DATE);
  Error

> SELECT cast(TIMESTAMP'1900-10-01 12:13:14' AS DATE);
  1900-10-01

> SELECT cast(TIMESTAMP_NTZ'1900-10-01 12:13:14' AS DATE);
  1900-10-01

TIMESTAMP

If the targetType is a TIMESTAMP type and sourceExpr is of type:

  • VOID

    The result is a NULL DATE.

  • numeric

    sourceExpr is read as the number of seconds since 1970-01-01 00:00:00 UTC.

    Fractions smaller than microseconds are truncated.

    If the value is outside of the range of TIMESTAMP, an overflow error is raised.

    Use try_cast to turn overflow errors into NULL.

  • STRING

    sourceExpr must be a valid timestampString.

    If sourceExpr is not a valid timestampString, Databricks returns an error.

    Use try_cast to turn invalid data errors into NULL.

  • DATE

    The result is the sourceExpr DATE at 00:00:00hrs.

  • TIMESTAMP_NTZ

The result is a timestamp value with the same year/month/day/hour/minute/second fields of timestamp_ntz sourceExpr.

Examples

> SELECT cast(NULL AS TIMESTAMP);
  NULL

> SET TIME ZONE '+00:00';
> SELECT cast(0.0 AS TIMESTAMP);
  1970-01-01 00:00:00

> SELECT cast(0.0000009 AS TIMESTAMP);
  1970-01-01 00:00:00

> SELECT cast(1e20 AS TIMESTAMP);
  Error: overflow

> SELECT cast('1900' AS TIMESTAMP);
  1900-01-01 00:00:00

> SELECT cast('1900-10-01 12:13:14' AS TIMESTAMP);
  1900-10-01 12:13:14

> SELECT cast('1900-02-30 12:13:14' AS TIMESTAMP);
  Error

> SELECT cast(DATE'1900-10-01' AS TIMESTAMP);
  1900-10-01 00:00:00

> SELECT cast(TIMESTAMP_NTZ'2023-01-01 02:03:04.567' as TIMESTAMP)
  2023-01-01 02:03:04.567

TIMESTAMP_NTZ

If the targetType is a TIMESTAMP_NTZ type and sourceExpr is of type:

  • VOID

    The result is a NULL DATE.

  • STRING

    sourceExpr must be a valid timestampString.

    If sourceExpr is not a valid timestampString, Databricks returns an error.

    Use try_cast to turn invalid data errors into NULL.

  • DATE

    The result is the sourceExpr DATE at 00:00:00hrs.

  • TIMESTAMP

The result is local time as the sourceExpr in the session time zone.

Examples

> SELECT cast(NULL AS TIMESTAMP_NTZ);
  NULL

> SELECT cast('1900' AS TIMESTAMP_NTZ);
  1900-01-01 00:00:00

> SELECT cast('1900-10-01 12:13:14' AS TIMESTAMP_NTZ);
  1900-10-01 12:13:14

> SELECT cast('1900-02-30 12:13:14' AS TIMESTAMP_NTZ);
  Error

> SELECT cast(DATE'1900-10-01' AS TIMESTAMP_NTZ);
  1900-10-01 00:00:00

> SELECT current_timezone(), CAST(TIMESTAMP'2021-7-1T8:43:28' as TIMESTAMP_NTZ);
  America/Los_Angeles 2021-07-01 08:43:28

> SELECT current_timezone(), CAST(TIMESTAMP'2021-7-1T8:43:28UTC+3' as TIMESTAMP_NTZ);
  America/Los_Angeles 2021-06-30 22:43:28

year-month interval

If the targetType is a year-month interval and sourceExpr is of type:

Examples

> SELECT cast(NULL AS INTERVAL YEAR);
  NULL

> SELECT cast('1-4' AS INTERVAL YEAR TO MONTH)::STRING;
  INTERVAL '1-4' YEAR TO MONTH

> SELECT cast('1' AS INTERVAL YEAR TO MONTH);
  error

> SELECT cast(INTERVAL '1-4' YEAR TO MONTH AS INTERVAL MONTH)::STRING;
  INTERVAL '16' MONTH

> SELECT cast(14 AS INTERVAL YEAR TO MONTH)::STRING;
  INTERVAL '1-2' YEAR TO MONTH

> SELECT cast(INTERVAL '1-11' YEAR TO MONTH AS INTERVAL YEAR)::STRING;
  INTERVAL '1' YEAR

day-time interval

If the targetType is a day-time interval and sourceExpr is of type:

  • VOID

    The result is a NULL day-time interval.

  • exact_numeric

    Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime 11.2 and above

    The numeric is interpreted as number of lower units of the targetType dayTimeIntervalQualifier. If the unit is SECOND any fractions are interpreted as fractional seconds.

  • STRING

    sourceExpr must be a valid dayTimeIntervalString.

    If sourceExpr is not a valid dayTimeIntervalString, Databricks returns an error.

    Use try_cast to turn invalid data errors into NULL.

  • day-time interval

    If the targetType dayTimeIntervalQualifier includes the smallest unit of the source type dayTimeIntervalQualifier, the value remains unchanged, but is reinterpreted to match the target type.

    Otherwise, the sourceExpr interval is truncated to fit the targetType.

> SELECT cast(NULL AS INTERVAL HOUR);
  NULL

> SELECT cast('1 4:23' AS INTERVAL DAY TO MINUTE)::STRING;
  INTERVAL '1 04:23' DAY TO MINUTE

> SELECT cast('1' AS INTERVAL DAY TO MINUTE);
  error

> SELECT cast(INTERVAL '1 4:23' DAY TO MINUTE AS INTERVAL MINUTE)::STRING;
  INTERVAL '1703' MINUTE

> SELECT cast(INTERVAL '1 4:23' DAY TO MINUTE AS INTERVAL HOUR)::STRING;
  INTERVAL '28' HOUR

> SELECT cast(125.3 AS INTERVAL MINUTE TO SECOND)::STRING;
  INTERVAL '2:5.3' MINUTE TO SECOND

BOOLEAN

If the targetType is a BOOLEAN and sourceExpr is of type:

  • VOID

    The result is a NULL Boolean.

  • numeric

    If sourceExpr is:

  • STRING

    If sourcEexpr is (case insensitive):

    • 'T', 'TRUE', 'Y', 'YES', or '1': The result is true

    • 'F', 'FALSE', 'N', 'NO', or '0': The result is false

    • NULL: The result is NULL

    Otherwise, Databricks returns an invalid input syntax for type boolean error.

    Use try_cast to turn invalid data errors into NULL.

Examples

> SELECT cast(NULL AS BOOLEAN);
  NULL

> SELECT cast('T' AS BOOLEAN);
  true

> SELECT cast('True' AS BOOLEAN);
  true

> SELECT cast('1' AS BOOLEAN);
  true

> SELECT cast('0' AS BOOLEAN);
  false

> SELECT cast('n' AS BOOLEAN);
  false

> SELECT cast('on' AS BOOLEAN);
  error: invalid input syntax for type boolean

> SELECT cast(0 AS BOOLEAN);
  false

> SELECT cast(0.0E10 AS BOOLEAN);
  false

> SELECT cast(1 AS BOOLEAN);
  true

> SELECT cast(0.1 AS BOOLEAN);
  true

> SELECT cast('NaN'::FLOAT AS BOOLEAN);
  true

BINARY

If the targetType is a BINARY and sourceExpr is of type:

  • VOID

    The result is a NULL Binary.

  • STRING

    The result is the UTF-8 encoding of the surceExpr.

Examples

> SELECT cast(NULL AS BINARY);
  NULL

> SELECT hex(cast('Spark SQL' AS BINARY));
  537061726B2053514C

> SELECT hex(cast('Oдesa' AS BINARY));
  4FD0B4657361

ARRAY

If the targetType is an ARRAY < targetElementType > and sourceExpr is of type:

  • VOID

    The result is a NULL of the targeType.

  • ARRAY < sourceElementType >

    If the cast from sourceElementType to targetElementType is supported, the result is an ARRAY<targetElementType> with all elements cast to the targetElementType.

    Databricks raises an error if the cast isn’t supported or if any of the elements can’t be cast.

    Use try_cast to turn invalid data or overflow errors into NULL.

Examples

> SELECT cast(NULL AS ARRAY<INT>);
  NULL

> SELECT cast(array('t', 'f', NULL) AS ARRAY<BOOLEAN>);
  [true, false, NULL]

> SELECT cast(array('t', 'f', NULL) AS INTERVAL YEAR);
  error: cannot cast array<string> to interval year

> SELECT cast(array('t', 'f', 'o') AS ARRAY<BOOLEAN>);
  error: invalid input syntax for type boolean: o.

MAP

If the targetType is an MAP < targetKeyType, targetValueType > and sourceExpr is of type:

  • VOID

    The result is a NULL of the targetType.

  • MAP <sourceKeyType, sourceValueType >

    If the casts from sourceKeyType to targetKeyType and sourceValueType to targetValueType are supported, the result is an MAP<targetKeyType, targetValueType> with all keys cast to the targetKeyType and all values cast to the targetValueType.

    Databricks raises an error if the cast isn’t supported or if any of the keys or values can’t be cast.

    Use try_cast to turn invalid data or overflow errors into NULL.

Examples

> SELECT cast(NULL AS MAP<STRING, INT>);
  NULL

> SELECT cast(map('10', 't', '15', 'f', '20', NULL) AS MAP<INT, BOOLEAN>);
  {10:true,15:false,20:null}

> SELECT cast(map('10', 't', '15', 'f', '20', NULL) AS MAP<INT, ARRAY<INT>>);
  error: cannot cast map<string,string> to map<int,array<int>>

> SELECT cast(map('10', 't', '15', 'f', '20', 'o') AS MAP<INT, BOOLEAN>);
  error: invalid input syntax for type boolean: o.

STRUCT

If the targetType is a STRUCT <[targetFieldName : targetFieldType [NOT NULL] [COMMENT str] [, …]] > and sourceExpr is of type:

  • VOID

    The result is a NULL of the targetType.

  • STRUCT < [sourceFieldName : sourceFieldType [NOT NULL] [COMMENT str] [, …]] >

    The sourceExpr can be cast to targetType if all of thee conditions are true:

    • The source type has the same number of fields as the target

    • For all fields: sourceFieldTypeN can be cast to the targetFieldTypeN.

    • For all field values: The source field value N can be cast to targetFieldTypeN and the value isn’t null if target field N is marked as NOT NULL.

    sourceFieldNames, source NOT NULL constraints, and source COMMENTs need not match the targetType and are ignored.

    Databricks raises an error if the cast isn’t supported or if any of the keys or values can’t be cast.

    Use try_cast to turn invalid data or overflow errors into NULL.

Examples

> SELECT cast(NULL AS STRUCT<a:INT>);
  NULL

> SELECT cast(named_struct('a', 't', 'b', '1900') AS STRUCT<b:BOOLEAN, c:DATE NOT NULL COMMENT 'Hello'>);
  {"b":true,"c":1900-01-01}

> SELECT cast(named_struct('a', 't', 'b', NULL::DATE) AS STRUCT<b:BOOLEAN, c:DATE NOT NULL COMMENT 'Hello'>);
  error: cannot cast struct<a:string,b:date> to struct<b:boolean,c:date>

> SELECT cast(named_struct('a', 't', 'b', '1900') AS STRUCT<b:BOOLEAN, c:ARRAY<INT>>);
  error: cannot cast struct<a:string,b:string> to struct<b:boolean,c:array<int>>

> SELECT cast(named_struct('a', 't', 'b', 'hello') AS STRUCT<b:BOOLEAN, c:DATE>);
  error: Cannot cast hello to DateType