partitioning.bucket

A transform for any type that partitions by a hash of the input column.

note

This function can be used only in combination with the DataFrameWriterV2.partitionedBy method.

Syntax

Python
from pyspark.sql.functions import partitioning

partitioning.bucket(numBuckets, col)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| numBuckets | pyspark.sql.Column or int | The number of buckets. |
| col | pyspark.sql.Column or str | Target column to work on; columns of any type can be bucketed. |
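
Because numBuckets accepts either an int or a Column, and col accepts either a column name or a Column object, the calls below build the same transform. A minimal sketch, assuming an existing DataFrame df with a ts column:

Python
from pyspark.sql.functions import lit, partitioning

# Equivalent ways to request 16 hash buckets on `ts`:
partitioning.bucket(16, "ts")            # int + column name
partitioning.bucket(lit(16), df["ts"])   # literal Column + Column object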

Examples

Python
from pyspark.sql.functions import partitioning

# `df` is any existing DataFrame with a `ts` column; the target catalog
# must support DataFrameWriterV2 (e.g., an Iceberg catalog).
df.writeTo("catalog.db.table").partitionedBy(
    partitioning.bucket(42, "ts")
).createOrReplace()
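
Unlike the date and timestamp transforms (years, months, days, hours), bucket bounds the number of partitions up front: each row is assigned to one of numBuckets partitions based on the hash of its column value, which makes it a practical choice for high-cardinality columns such as IDs or timestamps.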