Cannot import timestamp_millis or unix_millis (Scala)


The functions timestamp_millis and unix_millis are not available in the DataFrame API.

These functions are SQL-only and are available in Spark 3.1.1 and above.

If you attempt to import timestamp_millis or unix_millis, you get an error message.

import org.apache.spark.sql.functions.{timestamp_millis, unix_millis}
command-1862356:1: error: value timestamp_millis is not a member of object org.apache.spark.sql.functions
import org.apache.spark.sql.functions.{timestamp_millis, unix_millis}
                                       ^

timestamp_millis and unix_millis both work correctly with direct SQL calls.

%sql
SELECT timestamp_millis(1230219000123)
 
timestamp_millis(1230219000123)
2008-12-25T15:30:00.123+0000

Showing all 1 rows.

%sql
SELECT unix_millis(TIMESTAMP('1970-01-01 00:00:01Z'));
 
unix_millis(CAST(1970-01-01 00:00:01Z AS TIMESTAMP))
1000

Showing all 1 rows.
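You can also run the same SQL calls from a Scala cell with spark.sql(), which returns a DataFrame. This is a minimal sketch, assuming spark is the SparkSession provided by the notebook:

// Call the SQL-only functions through spark.sql() instead of a %sql cell.
val tsDf = spark.sql("SELECT timestamp_millis(1230219000123)")
tsDf.show(false)

val umDf = spark.sql("SELECT unix_millis(TIMESTAMP('1970-01-01 00:00:01Z'))")
umDf.show(false)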

timestamp_millis and unix_millis return an error message if you try to use them directly on a DataFrame.

import sqlContext.implicits._
val df = Seq(
 (1, "First Value"),
 (2, "Second Value")
).toDF("int_column", "string_column")
 
import org.apache.spark.sql.functions.{unix_millis}
import org.apache.spark.sql.functions.col
df.select(unix_millis(col("int_column"))).show()
command-1862359:7: error: value unix_millis is not a member of object org.apache.spark.sql.functions
import org.apache.spark.sql.functions.{unix_millis}
                                       ^
command-1862359:9: error: not found: value unix_millis
df.select(unix_millis(col("int_column"))).show()
          ^

If you want to use timestamp_millis or unix_millis with a DataFrame, call them through selectExpr().

import org.apache.spark.sql.functions._
import sqlContext.implicits._
val jdf = Seq(
 (1, "First Value"),
 (2, "Second Value")
).toDF("int_column", "string_column")
import org.apache.spark.sql.functions._
import sqlContext.implicits._
jdf: org.apache.spark.sql.DataFrame = [int_column: int, string_column: string]
display(jdf.selectExpr("timestamp_millis(int_column)"))
 
timestamp_millis(int_column)
1970-01-01T00:00:00.001+0000
1970-01-01T00:00:00.002+0000

Showing all 2 rows.

import org.apache.spark.sql.functions._
import sqlContext.implicits._
val ldf = Seq(
 (1, "First Value"),
 (2, "Second Value")
).toDF("int_column", "string_column")
import org.apache.spark.sql.functions._
import sqlContext.implicits._
ldf: org.apache.spark.sql.DataFrame = [int_column: int, string_column: string]
display(ldf.selectExpr("unix_millis(int_column)"))
 
unix_millis(int_column)
1000
2000

Showing all 2 rows.

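An alternative to selectExpr() is the expr() function from org.apache.spark.sql.functions, which parses a SQL expression string into a Column and can be mixed with other column expressions in a regular select(). This is a minimal sketch, assuming the same two-row DataFrame as above:

import org.apache.spark.sql.functions.expr
import sqlContext.implicits._

val edf = Seq(
  (1, "First Value"),
  (2, "Second Value")
).toDF("int_column", "string_column")

// expr() parses the SQL fragment, so SQL-only functions such as
// timestamp_millis are usable even without a Scala wrapper.
display(edf.select(expr("timestamp_millis(int_column)"), $"string_column"))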