bucket
Partition transform function: A transform for any type that partitions by a hash of the input column. Supports Spark Connect.
warning
Deprecated in 4.0.0. Use partitioning.bucket instead.
Syntax
Python
from pyspark.databricks.sql import functions as dbf
dbf.bucket(numBuckets=<numBuckets>, col=<col>)
Parameters
| Parameter | Type | Description |
|---|---|---|
| `numBuckets` | `Column` or int | The number of buckets. |
| `col` | `Column` or str | Target column to work on. |
Returns
pyspark.sql.Column: Data partitioned by given columns.
Examples
Python
from pyspark.databricks.sql import functions as dbf

df.writeTo("catalog.db.table").partitionedBy(
    dbf.bucket(42, "ts")
).createOrReplace()
note
This function can be used only in combination with the partitionedBy method of the DataFrameWriterV2.
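Conceptually, a bucket transform assigns each row to one of `numBuckets` partitions by hashing the column value and taking the remainder. The sketch below illustrates that idea in plain Python; it is an assumption-level illustration only, since Spark computes buckets with its own Murmur3-based hash, not Python's built-in `hash()`.

```python
# Illustration of hash bucketing (not Spark's actual hash function):
# each value maps to a bucket index in [0, num_buckets).
def bucket_of(value, num_buckets):
    """Assign a value to one of num_buckets partitions by hashing."""
    return hash(value) % num_buckets

rows = ["2024-01-01", "2024-01-02", "2024-01-03"]
buckets = [bucket_of(r, 4) for r in rows]
# Every row lands in a valid bucket, and the mapping is deterministic
# within a process, so equal values always share a bucket.
assert all(0 <= b < 4 for b in buckets)
assert bucket_of("2024-01-01", 4) == bucket_of("2024-01-01", 4)
```

Because equal column values always hash to the same bucket, rows with the same key are colocated in the same partition, which is what makes bucketed layouts useful for joins and point lookups.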