Column class
A column in a DataFrame.
Supports Spark Connect
Syntax
Methods
| Method | Description |
|---|---|
| `alias` | Returns this column aliased with a new name or names (in the case of expressions that return more than one column, such as explode). |
| `asc` | Returns a sort expression based on the ascending order of the column. |
| `asc_nulls_first` | Returns a sort expression based on the ascending order of the column, with null values appearing before non-null values. |
| `asc_nulls_last` | Returns a sort expression based on the ascending order of the column, with null values appearing after non-null values. |
| `astype` | Alias for `cast`. |
| `between` | Checks whether the current column's values are between the specified lower and upper bounds, inclusive. |
| `bitwiseAND` | Computes the bitwise AND of this expression with another expression. |
| `bitwiseOR` | Computes the bitwise OR of this expression with another expression. |
| `bitwiseXOR` | Computes the bitwise XOR of this expression with another expression. |
| `cast` | Casts the column into type `dataType`. |
| `contains` | Checks whether the column value contains the other element. |
| `desc` | Returns a sort expression based on the descending order of the column. |
| `desc_nulls_first` | Returns a sort expression based on the descending order of the column, with null values appearing before non-null values. |
| `desc_nulls_last` | Returns a sort expression based on the descending order of the column, with null values appearing after non-null values. |
| `dropFields` | An expression that drops fields in a StructType by name. |
| `endswith` | Checks whether the string ends with the given value. |
| `eqNullSafe` | Equality test that is safe for null values. |
| `getField` | An expression that gets a field by name in a StructType. |
| `getItem` | An expression that gets an item at position ordinal out of a list, or gets an item by key out of a dict. |
| `ilike` | SQL ILIKE expression (case-insensitive LIKE). |
| `isNaN` | True if the current expression is NaN. |
| `isNotNull` | True if the current expression is NOT null. |
| `isNull` | True if the current expression is null. |
| `isin` | A boolean expression that evaluates to true if the value of this expression is contained in the evaluated values of the arguments. |
| `like` | SQL LIKE expression. |
| `name` | Alias for `alias`. |
| `otherwise` | Evaluates a list of conditions and returns one of multiple possible result expressions. |
| `over` | Defines a windowing column. |
| `rlike` | SQL RLIKE expression (LIKE with regex). |
| `startswith` | Checks whether the string starts with the given value. |
| `substr` | Returns a Column that is a substring of the column. |
| `try_cast` | A special version of `cast` that returns null instead of raising an error when the cast fails. |
| `when` | Evaluates a list of conditions and returns one of multiple possible result expressions. |
| `withField` | An expression that adds or replaces a field in a StructType by name. |
Operators
The Column class supports standard Python operators for arithmetic, comparison, and logical operations:
- Arithmetic: `+`, `-`, `*`, `/`, `%`, `**`
- Comparison: `==`, `!=`, `<`, `<=`, `>`, `>=`
- Logical: `&` (AND), `|` (OR), `~` (NOT)
Examples
For simpler examples that demonstrate column usage, see Column operations.
Create Column instances
Select a column from a DataFrame:
```python
# Assumes an active SparkSession named `spark`
df = spark.createDataFrame(
    [(2, "Alice"), (5, "Bob")], ["age", "name"])

# Access by attribute
df.name
# Column<'name'>

# Access by bracket notation
df["name"]
# Column<'name'>
```
Create a column from an expression:
```python
df.age + 1
# Column<...>

1 / df.age
# Column<...>
```
Basic column operations
```python
# Arithmetic operations
df.select(df.age + 10).show()

# Comparison operations
df.filter(df.age > 3).show()

# String operations
df.filter(df.name.startswith("A")).show()

# Null checking
df.filter(df.name.isNotNull()).show()
```
Conditional logic
```python
from pyspark.sql import functions as F

df.select(
    F.when(df.age < 3, "child")
    .when(df.age < 13, "kid")
    .otherwise("adult")
    .alias("age_group")
).show()
```
Sorting
```python
df.orderBy(df.age.desc()).show()
df.orderBy(df.age.asc_nulls_last()).show()
```