asTable

Converts the DataFrame into a TableArg object, which can be passed as a table argument to a table-valued function (TVF), including a user-defined table function (UDTF).

Syntax

asTable()

Returns

TableArg: A TableArg object representing a table argument.

Notes

After obtaining a TableArg from a DataFrame with this method, you can specify how the table argument is partitioned and ordered by calling partitionBy, orderBy, or withSinglePartition on the TableArg instance.
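The effect of partitioning and ordering on the rows a UDTF receives can be sketched in plain Python, without Spark. This is only an illustrative model, not Spark's implementation: `CollectValues` and `run_partitioned` are hypothetical names introduced here, and the sketch assumes that each partition is consumed by its own handler instance, that rows arrive in the requested order, and that `terminate` runs once per partition.

```python
from itertools import groupby
from operator import itemgetter


class CollectValues:
    """Hypothetical stand-in for a UDTF handler class."""

    def __init__(self):
        self.key = None
        self.values = []

    def eval(self, row):
        # Called once per input row of the partition.
        self.key = row["key"]
        self.values.append(row["value"])

    def terminate(self):
        # Called once after the last row of the partition.
        yield self.key, "".join(self.values)


def run_partitioned(rows, handler_cls, partition_key, order_key):
    """Hypothetical helper that mimics partitionBy(...).orderBy(...)."""
    output = []
    # Sort by partition key, then by the ordering key within each partition.
    ordered = sorted(rows, key=itemgetter(partition_key, order_key))
    for _, group in groupby(ordered, key=itemgetter(partition_key)):
        handler = handler_cls()  # fresh instance per partition
        for row in group:
            handler.eval(row)
        output.extend(handler.terminate())
    return output


rows = [
    {"key": 1, "value": "b"},
    {"key": 2, "value": "d"},
    {"key": 1, "value": "a"},
    {"key": 2, "value": "c"},
]
result = run_partitioned(rows, CollectValues, "key", "value")
print(result)  # [(1, 'ab'), (2, 'cd')]
```

In this model, withSinglePartition would correspond to feeding every row to a single handler instance instead of one instance per key.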

Examples

Python
from pyspark.sql.functions import udtf

@udtf(returnType="id: int, doubled: int")
class DoubleUDTF:
    def eval(self, row):
        yield row["id"], row["id"] * 2

df = spark.createDataFrame([(1,), (2,), (3,)], ["id"])

result = DoubleUDTF(df.asTable())
result.show()
# +---+-------+
# | id|doubled|
# +---+-------+
# |  1|      2|
# |  2|      4|
# |  3|      6|
# +---+-------+

df2 = spark.createDataFrame(
[(1, "a"), (1, "b"), (2, "c"), (2, "d")], ["key", "value"]
)

@udtf(returnType="key: int, value: string")
class ProcessUDTF:
    def eval(self, row):
        yield row["key"], row["value"]

# Partition the table argument by "key" and order rows within each partition by "value".
result2 = ProcessUDTF(df2.asTable().partitionBy("key").orderBy("value"))
result2.show()
# +---+-----+
# |key|value|
# +---+-----+
# |  1|    a|
# |  1|    b|
# |  2|    c|
# |  2|    d|
# +---+-----+