from_utc_timestamp

Converts a timestamp that is timezone-agnostic (interpreted as a UTC timestamp) to a timestamp in the given time zone. This is a common function for databases supporting TIMESTAMP WITHOUT TIMEZONE.

However, a timestamp in Spark represents the number of microseconds since the Unix epoch, which is not timezone-agnostic. So in Spark this function simply shifts the timestamp value from the UTC time zone to the given time zone.
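As a sketch of what this shift means, here is the equivalent operation in plain Python with `zoneinfo` (an illustration of the semantics, not Spark's implementation): interpret the timezone-agnostic value as a UTC instant, convert it to the target zone, then drop the zone information again.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def from_utc_timestamp_sketch(naive_ts: datetime, tz: str) -> datetime:
    # Interpret the timezone-agnostic value as an instant in UTC...
    aware = naive_ts.replace(tzinfo=timezone.utc)
    # ...shift it to the target zone, then drop the zone again,
    # mirroring how Spark keeps the result as a plain timestamp.
    return aware.astimezone(ZoneInfo(tz)).replace(tzinfo=None)

print(from_utc_timestamp_sketch(datetime(1997, 2, 28, 10, 30), "Asia/Tokyo"))
# 1997-02-28 19:30:00  (Asia/Tokyo is UTC+9)
```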

This function may return an unexpected result if the input is a string with a time zone, e.g. '2018-03-13T06:18:23+00:00', because Spark first casts the string to a timestamp according to the time zone embedded in the string, and then displays the result by converting the timestamp back to a string according to the session local time zone.
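A plain-Python sketch of why this is surprising (assuming, for illustration, a session time zone of 'America/Los_Angeles' and 'Asia/Tokyo' as the tz argument): the cast already honours the offset in the string, so the later shift is applied on top of it.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

session_tz = ZoneInfo("America/Los_Angeles")   # assumed session local time zone
tokyo_offset = timedelta(hours=9)              # 'Asia/Tokyo' as the tz argument

s = "2018-03-13T06:18:23+00:00"

# The cast honours the '+00:00' offset embedded in the string, so Spark
# starts from the absolute instant 2018-03-13 06:18:23 UTC, not from a
# timezone-agnostic value.
instant = datetime.fromisoformat(s)

# from_utc_timestamp then shifts the instant by the target zone's offset...
shifted = instant + tokyo_offset

# ...and the result is displayed in the session local time zone.
displayed = shifted.astimezone(session_tz).replace(tzinfo=None)
print(displayed)   # 2018-03-13 08:18:23, not the 2018-03-13 15:18:23 one
                   # might expect from "06:18:23 is UTC, add 9 hours"
```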

For the corresponding Databricks SQL function, see from_utc_timestamp function.

Syntax

Python
from pyspark.databricks.sql import functions as dbf

dbf.from_utc_timestamp(timestamp=<timestamp>, tz=<tz>)

Parameters

| Parameter | Type | Description |
|---|---|---|
| `timestamp` | `pyspark.sql.Column` or `str` | The column that contains timestamps. |
| `tz` | `pyspark.sql.Column` or literal string | A string detailing the time zone ID that the input should be adjusted to. It should be in the format of either region-based zone IDs or zone offsets. Region IDs must have the form 'area/city', such as 'America/Los_Angeles'. Zone offsets must be in the format '(+\|-)HH:mm', for example '-08:00' or '+01:00'. 'UTC' and 'Z' are also supported as aliases of '+00:00'. Other short names are not recommended because they can be ambiguous. |
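To make the accepted formats concrete, here is a small plain-Python resolver sketch (not Spark's actual parser, which lives in the JVM) that handles the three cases described above:

```python
import re
from datetime import timedelta, timezone
from zoneinfo import ZoneInfo

def resolve_zone(tz: str):
    """Sketch of the accepted tz formats; not Spark's real parser."""
    if tz in ("UTC", "Z"):
        # 'UTC' and 'Z' are aliases of '+00:00'.
        return timezone.utc
    m = re.fullmatch(r"([+-])(\d{2}):(\d{2})", tz)
    if m:
        # Zone offset in the '(+|-)HH:mm' format, e.g. '-08:00'.
        sign = 1 if m.group(1) == "+" else -1
        return timezone(sign * timedelta(hours=int(m.group(2)),
                                         minutes=int(m.group(3))))
    # Otherwise treat it as a region-based ID, e.g. 'America/Los_Angeles'.
    return ZoneInfo(tz)
```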

Returns

A `pyspark.sql.Column` containing the timestamp value represented in the given time zone.

Examples

Python
from pyspark.databricks.sql import functions as dbf

df = spark.createDataFrame([('1997-02-28 10:30:00', 'JST')], ['ts', 'tz'])

# Shift the UTC timestamps to a time zone given as a literal string.
df.select('*', dbf.from_utc_timestamp('ts', 'PST')).show()
# 1997-02-28 10:30:00 UTC becomes 1997-02-28 02:30:00 in PST (UTC-8).

# The time zone can also be taken from a column.
df.select('*', dbf.from_utc_timestamp(df.ts, df.tz)).show()
# 1997-02-28 10:30:00 UTC becomes 1997-02-28 19:30:00 in JST (UTC+9).