Skip to main content

unpivot

Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns set. This is the reverse to groupBy(...).pivot(...).agg(...), except for the aggregation, which cannot be reversed.

Added in Databricks Runtime 11.1

Syntax

unpivot(ids: Union["ColumnOrName", List["ColumnOrName"], Tuple["ColumnOrName", ...]], values: Optional[Union["ColumnOrName", List["ColumnOrName"], Tuple["ColumnOrName", ...]]], variableColumnName: str, valueColumnName: str)

Parameters

Parameter

Type

Description

ids

str, Column, tuple, list

Column(s) to use as identifiers. Can be a single column or column name, or a list or tuple for multiple columns.

values

str, Column, tuple, list, optional

Column(s) to unpivot. Can be a single column or column name, or a list or tuple for multiple columns. If specified, must not be empty. If not specified, uses all columns that are not set as ids.

variableColumnName

str

Name of the variable column.

valueColumnName

str

Name of the value column.

Returns

DataFrame: Unpivoted DataFrame.

Notes

Supports Spark Connect.

Examples

Python
df = spark.createDataFrame(
[(1, 11, 1.1), (2, 12, 1.2)],
["id", "int", "double"],
)
df.show()
# +---+---+------+
# | id|int|double|
# +---+---+------+
# | 1| 11| 1.1|
# | 2| 12| 1.2|
# +---+---+------+

from pyspark.sql import functions as sf
df.unpivot(
"id", ["int", "double"], "var", "val"
).sort("id", sf.desc("var")).show()
# +---+------+----+
# | id| var| val|
# +---+------+----+
# | 1| int|11.0|
# | 1|double| 1.1|
# | 2| int|12.0|
# | 2|double| 1.2|
# +---+------+----+