PySpark data types

This page lists the PySpark data types available on Databricks, with a short description of each; usage sketches follow the table.

| Data type | Description |
| --- | --- |
| `ArrayType(elementType[, containsNull])` | Array data type |
| `BinaryType()` | Binary (byte array) data type |
| `BooleanType()` | Boolean data type |
| `ByteType()` | Byte data type, representing signed 8-bit integers |
| `CalendarIntervalType()` | Calendar interval data type |
| `CharType(length)` | Char data type |
| `DataType()` | Base class for data types |
| `DateType()` | Date (datetime.date) data type |
| `DayTimeIntervalType([startField, endField])` | Day-time interval (datetime.timedelta) data type |
| `DecimalType([precision, scale])` | Decimal (decimal.Decimal) data type |
| `DoubleType()` | Double data type, representing double-precision floats |
| `FloatType()` | Float data type, representing single-precision floats |
| `Geography` (Databricks only) | Geography data type |
| `Geometry` (Databricks only) | Geometry data type |
| `IntegerType()` | Int data type, representing signed 32-bit integers |
| `LongType()` | Long data type, representing signed 64-bit integers |
| `MapType(keyType, valueType[, valueContainsNull])` | Map data type |
| `NullType()` | Null type |
| `ShortType()` | Short data type, representing signed 16-bit integers |
| `StringType([collation])` | String data type |
| `StructField(name, dataType[, nullable, metadata])` | A field in StructType |
| `StructType([fields])` | Struct type, consisting of a list of StructField |
| `TimestampType()` | Timestamp (datetime.datetime) data type |
| `TimestampNTZType()` | Timestamp (datetime.datetime) data type without timezone information |
| `VarcharType(length)` | Varchar data type |
| `VariantType()` | Variant data type, representing semi-structured values |
| `YearMonthIntervalType([startField, endField])` | Year-month interval data type, representing SQL-standard year-month intervals |
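
These classes live in the `pyspark.sql.types` module and are most often combined into a `StructType` schema. The following is a minimal sketch, not taken from this page: the column names (`event_id`, `tags`, and so on) and sample row are illustrative.

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import (
    ArrayType,
    IntegerType,
    MapType,
    StringType,
    StructField,
    StructType,
    TimestampType,
)

spark = SparkSession.builder.getOrCreate()

# Illustrative schema combining scalar and complex types from the table above.
schema = StructType([
    StructField("event_id", StringType(), nullable=False),
    StructField("user_id", IntegerType(), nullable=True),
    StructField("occurred_at", TimestampType(), nullable=True),
    # ArrayType(elementType[, containsNull]): an ordered list of strings.
    StructField("tags", ArrayType(StringType(), containsNull=True), nullable=True),
    # MapType(keyType, valueType[, valueContainsNull]): string-to-string attributes.
    StructField(
        "attributes",
        MapType(StringType(), StringType(), valueContainsNull=True),
        nullable=True,
    ),
])

df = spark.createDataFrame(
    [("e1", 42, None, ["new", "mobile"], {"os": "ios"})],
    schema=schema,
)
df.printSchema()
```

Each `StructField`'s `nullable` flag controls whether the column accepts NULL values, and `printSchema()` echoes the declared structure, including the element and value types of the array and map columns.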
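
Several of the types are parameterized: `DecimalType` by precision and scale, and the interval types by start and end fields. A small sketch of how standard Spark SQL literals and casts map back to these Python classes; the column aliases are made up for illustration.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Parameterized types expressed in SQL; the classes in the table above are
# what you see when you inspect the resulting schema.
df = spark.sql("""
    SELECT
        CAST(12.345 AS DECIMAL(10, 2))   AS amount,   -- DecimalType(10, 2)
        INTERVAL '1 12:30' DAY TO MINUTE AS elapsed,  -- DayTimeIntervalType(DAY, MINUTE)
        INTERVAL '2-6' YEAR TO MONTH     AS tenure    -- YearMonthIntervalType(YEAR, MONTH)
""")

# Prints decimal(10,2), interval day to minute, and interval year to month.
df.printSchema()
```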