pandas_api

Converts the existing DataFrame into a pandas-on-Spark DataFrame.

Syntax

pandas_api(index_col: Optional[Union[str, List[str]]] = None)

Parameters

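Parameter | Type | Description
index_col | str or list of str, optional | Name of the column, or list of column names, in the Spark DataFrame to use as the index of the resulting pandas-on-Spark DataFrame. If omitted, a default index is used.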

Returns

PandasOnSparkDataFrame

Notes

If a pandas-on-Spark DataFrame is converted to a Spark DataFrame and then back to pandas-on-Spark, the index information is lost and the original index is turned into a normal column.

This method is available only if pandas is installed.

Examples

Python
df = spark.createDataFrame(
    [(14, "Tom"), (23, "Alice"), (16, "Bob")], ["age", "name"])

df.pandas_api()
#    age   name
# 0   14    Tom
# 1   23  Alice
# 2   16    Bob

df.pandas_api(index_col="age")
#       name
# age
# 14     Tom
# 23   Alice
# 16     Bob