Data types
Applies to: Databricks SQL and Databricks Runtime
For rules governing how conflicts between data types are resolved, see SQL data type rules.
Supported data types
Databricks supports the following data types:
Data Type | Description |
---|---|
BIGINT | Represents 8-byte signed integer numbers. |
BINARY | Represents byte sequence values. |
BOOLEAN | Represents Boolean values. |
DATE | Represents values comprising values of fields year, month, and day, without a time zone. |
DECIMAL(p,s) | Represents numbers with maximum precision p and fixed scale s. |
DOUBLE | Represents 8-byte double-precision floating point numbers. |
FLOAT | Represents 4-byte single-precision floating point numbers. |
INT | Represents 4-byte signed integer numbers. |
INTERVAL intervalQualifier | Represents intervals of time either on a scale of seconds or months. |
VOID | Represents the untyped NULL. |
SMALLINT | Represents 2-byte signed integer numbers. |
STRING | Represents character string values. |
TIMESTAMP | Represents values comprising values of fields year, month, day, hour, minute, and second, with the session local time zone. |
TIMESTAMP_NTZ | Represents values comprising values of fields year, month, day, hour, minute, and second. All operations are performed without taking any time zone into account. |
TINYINT | Represents 1-byte signed integer numbers. |
ARRAY < elementType > | Represents values comprising a sequence of elements with the type of elementType. |
MAP < keyType,valueType > | Represents values comprising a set of key-value pairs. |
STRUCT < [fieldName : fieldType [NOT NULL][COMMENT str][, …]] > | Represents values with the structure described by a sequence of fields. |
VARIANT | Represents semi-structured data. |
OBJECT | Represents values in a VARIANT with the structure described by a set of fields. |
Delta Lake does not support the VOID type.
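The byte widths listed for the integer types determine the values each type can hold. The following is a minimal sketch in plain Python, assuming the usual two's-complement layout for signed integers; the names INTEGRAL_WIDTHS, signed_range, and fits are hypothetical helpers written for this illustration and are not part of any Databricks or Spark API.

```python
# Byte widths of the signed integer types, taken from the table above.
INTEGRAL_WIDTHS = {"TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8}


def signed_range(num_bytes):
    """Return (min, max) for a two's-complement signed integer of num_bytes bytes."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def fits(type_name, value):
    """Return True if value lies within the range of the named integer type."""
    low, high = signed_range(INTEGRAL_WIDTHS[type_name])
    return low <= value <= high


for name, width in INTEGRAL_WIDTHS.items():
    low, high = signed_range(width)
    print(f"{name}: {low} to {high}")
```

A check such as fits("TINYINT", 128) returns False, which is the kind of out-of-range value worth catching before data is written to a narrower column.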
Data type classification
Data types are grouped into the following classes:
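- Exact numeric types represent base-10 numbers: the integral types (TINYINT, SMALLINT, INT, BIGINT) and DECIMAL.
- Binary floating point types use exponents and a binary representation to cover a large range of numbers: FLOAT and DOUBLE.
- Numeric types represent all numeric data types: exact numeric and binary floating point.
- Date-time types represent date and time components: DATE, TIMESTAMP, and TIMESTAMP_NTZ.
- Simple types are types defined by holding singleton values: numeric, date-time, BINARY, BOOLEAN, INTERVAL, and STRING.
- Complex types are composed of multiple components of complex or simple types: ARRAY, MAP, and STRUCT.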
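The practical difference between exact numeric and binary floating point types shows up in simple arithmetic. The sketch below uses only the Python standard library: float stands in for a binary floating point type such as DOUBLE, and decimal.Decimal (the Python value type this page lists for DecimalType) stands in for an exact base-10 type. It illustrates the general behavior and is not a statement about how any particular engine evaluates expressions.

```python
from decimal import Decimal

# Binary floating point: 0.1 and 0.2 have no exact binary representation,
# so their sum is not exactly 0.3.
binary_sum = 0.1 + 0.2
print(binary_sum == 0.3)

# Exact base-10 arithmetic: the same sum is represented exactly.
exact_sum = Decimal("0.1") + Decimal("0.2")
print(exact_sum == Decimal("0.3"))
```

For values such as currency amounts, where base-10 fractions must be preserved exactly, an exact numeric type is usually the safer choice.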
Language mappings
Applies to: Databricks Runtime
- Scala
- Java
- Python
- R
Scala
Spark SQL data types are defined in the package org.apache.spark.sql.types. You access them by importing the package:
import org.apache.spark.sql.types._
SQL type | Data type | Value type | API to access or create data type |
---|---|---|---|
TINYINT | ByteType | Byte | ByteType |
SMALLINT | ShortType | Short | ShortType |
INT | IntegerType | Int | IntegerType |
BIGINT | LongType | Long | LongType |
FLOAT | FloatType | Float | FloatType |
DOUBLE | DoubleType | Double | DoubleType |
DECIMAL | DecimalType | java.math.BigDecimal | DecimalType |
STRING | StringType | String | StringType |
BINARY | BinaryType | Array[Byte] | BinaryType |
BOOLEAN | BooleanType | Boolean | BooleanType |
TIMESTAMP | TimestampType | java.sql.Timestamp | TimestampType |
TIMESTAMP_NTZ | TimestampNTZType | java.time.LocalDateTime | TimestampNTZType |
DATE | DateType | java.sql.Date | DateType |
INTERVAL (year-month) | YearMonthIntervalType | java.time.Period | YearMonthIntervalType (3) |
INTERVAL (day-time) | DayTimeIntervalType | java.time.Duration | DayTimeIntervalType (3) |
ARRAY | ArrayType | scala.collection.Seq | ArrayType(elementType [, containsNull]) (2) |
MAP | MapType | scala.collection.Map | MapType(keyType, valueType [, valueContainsNull]) (2) |
STRUCT | StructType | org.apache.spark.sql.Row | StructType(fields). fields is a Seq of StructField. (4) |
STRUCT field | StructField | The value type of the data type of this field (for example, Int for a StructField with the data type IntegerType) | StructField(name, dataType [, nullable]) (4) |
VARIANT | VariantType | org.apache.spark.unsafe.types.VariantVal | VariantType |
OBJECT | Not supported | Not supported | Not supported |
Java
Spark SQL data types are defined in the package org.apache.spark.sql.types. To access or create a data type, use the factory methods provided in org.apache.spark.sql.types.DataTypes.
SQL type | Data type | Value type | API to access or create data type |
---|---|---|---|
TINYINT | ByteType | byte or Byte | DataTypes.ByteType |
SMALLINT | ShortType | short or Short | DataTypes.ShortType |
INT | IntegerType | int or Integer | DataTypes.IntegerType |
BIGINT | LongType | long or Long | DataTypes.LongType |
FLOAT | FloatType | float or Float | DataTypes.FloatType |
DOUBLE | DoubleType | double or Double | DataTypes.DoubleType |
DECIMAL | DecimalType | java.math.BigDecimal | DataTypes.createDecimalType() or DataTypes.createDecimalType(precision, scale) |
STRING | StringType | String | DataTypes.StringType |
BINARY | BinaryType | byte[] | DataTypes.BinaryType |
BOOLEAN | BooleanType | boolean or Boolean | DataTypes.BooleanType |
TIMESTAMP | TimestampType | java.sql.Timestamp | DataTypes.TimestampType |
TIMESTAMP_NTZ | TimestampNTZType | java.time.LocalDateTime | DataTypes.TimestampNTZType |
DATE | DateType | java.sql.Date | DataTypes.DateType |
INTERVAL (year-month) | YearMonthIntervalType | java.time.Period | YearMonthIntervalType (3) |
INTERVAL (day-time) | DayTimeIntervalType | java.time.Duration | DayTimeIntervalType (3) |
ARRAY | ArrayType | java.util.List | DataTypes.createArrayType(elementType [, containsNull]) (2) |
MAP | MapType | java.util.Map | DataTypes.createMapType(keyType, valueType [, valueContainsNull]) (2) |
STRUCT | StructType | org.apache.spark.sql.Row | DataTypes.createStructType(fields). fields is a List or array of StructField. (4) |
STRUCT field | StructField | The value type of the data type of this field (for example, int for a StructField with the data type IntegerType) | DataTypes.createStructField(name, dataType, nullable) (4) |
VARIANT | VariantType | org.apache.spark.unsafe.types.VariantVal | VariantType |
OBJECT | Not supported | Not supported | Not supported |
Python
Spark SQL data types are defined in the package pyspark.sql.types. You access them by importing the package:
from pyspark.sql.types import *
SQL type | Data type | Value type | API to access or create data type |
---|---|---|---|
TINYINT | ByteType | int (1) | ByteType() |
SMALLINT | ShortType | int (1) | ShortType() |
INT | IntegerType | int | IntegerType() |
BIGINT | LongType | int (1) | LongType() |
FLOAT | FloatType | float (1) | FloatType() |
DOUBLE | DoubleType | float | DoubleType() |
DECIMAL | DecimalType | decimal.Decimal | DecimalType() |
STRING | StringType | str | StringType() |
BINARY | BinaryType | bytearray | BinaryType() |
BOOLEAN | BooleanType | bool | BooleanType() |
TIMESTAMP | TimestampType | datetime.datetime | TimestampType() |
TIMESTAMP_NTZ | TimestampNTZType | datetime.datetime | TimestampNTZType() |
DATE | DateType | datetime.date | DateType() |
INTERVAL (year-month) | YearMonthIntervalType | Not supported | Not supported |
INTERVAL (day-time) | DayTimeIntervalType | datetime.timedelta | DayTimeIntervalType (3) |
ARRAY | ArrayType | list, tuple, or array | ArrayType(elementType, [containsNull]) (2) |
MAP | MapType | dict | MapType(keyType, valueType, [valueContainsNull]) (2) |
STRUCT | StructType | list or tuple | StructType(fields). fields is a list of StructField. (4) |
STRUCT field | StructField | The value type of the data type of this field (for example, int for a StructField with the data type IntegerType) | StructField(name, dataType, [nullable]) (4) |
VARIANT | VariantType | VariantVal | VariantType() |
OBJECT | Not supported | Not supported | Not supported |
R
SQL type | Data type | Value type | API to access or create data type |
---|---|---|---|
TINYINT | ByteType | integer (1) | 'byte' |
SMALLINT | ShortType | integer (1) | 'short' |
INT | IntegerType | integer | 'integer' |
BIGINT | LongType | integer (1) | 'long' |
FLOAT | FloatType | numeric (1) | 'float' |
DOUBLE | DoubleType | numeric | 'double' |
DECIMAL | DecimalType | Not supported | Not supported |
STRING | StringType | character | 'string' |
BINARY | BinaryType | raw | 'binary' |
BOOLEAN | BooleanType | logical | 'bool' |
TIMESTAMP | TimestampType | POSIXct | 'timestamp' |
TIMESTAMP_NTZ | TimestampNTZType | datetime.datetime | TimestampNTZType() |
DATE | DateType | Date | 'date' |
INTERVAL (year-month) | YearMonthIntervalType | Not supported | Not supported |
INTERVAL (day-time) | DayTimeIntervalType | Not supported | Not supported |
ARRAY | ArrayType | vector or list | list(type='array', elementType=elementType, containsNull=[containsNull]) (2) |
MAP | MapType | environment | list(type='map', keyType=keyType, valueType=valueType, valueContainsNull=[valueContainsNull]) (2) |
STRUCT | StructType | named list | list(type='struct', fields=fields). fields is a list of StructField. (4) |
STRUCT field | StructField | The value type of the data type of this field (for example, integer for a StructField with the data type IntegerType) | list(name=name, type=dataType, nullable=[nullable]) (4) |
VARIANT | Not supported | Not supported | Not supported |
OBJECT | Not supported | Not supported | Not supported |
(1) Numbers are converted to the domain at runtime. Make sure that numbers are within range.
(2) The optional value defaults to TRUE.
(3) Interval types
- YearMonthIntervalType([startField,] endField): Represents a year-month interval made up of a contiguous subset of the fields YEAR and MONTH. startField is the leftmost field, and endField is the rightmost field of the type. Valid values of startField and endField are 0 (YEAR) and 1 (MONTH).
- DayTimeIntervalType([startField,] endField): Represents a day-time interval made up of a contiguous subset of the fields DAY, HOUR, MINUTE, and SECOND. startField is the leftmost field, and endField is the rightmost field of the type. Valid values of startField and endField are 0 (DAY), 1 (HOUR), 2 (MINUTE), and 3 (SECOND).
(4) StructType
- StructType(fields): Represents values with the structure described by a sequence, list, or array of StructFields (fields). Two fields with the same name are not allowed.
- StructField(name, dataType, nullable): Represents a field in a StructType. The name of a field is indicated by name. The data type of a field is indicated by dataType. nullable indicates whether values of this field can be null; allowing null values is the default.