substr (Column)

Returns a new Column containing, for each value, the substring that starts at startPos and is at most length characters long.

Syntax

Python
substr(startPos, length)

Parameters

Parameter | Type | Description
startPos | int or Column | Starting position (1-based)
length | int or Column | Length of the substring

Returns

Column: the substring of each value in the original column.

Examples

Example 1: Using integers for the input arguments.

Python
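# Assumes an active SparkSession is available as `spark`.
df = spark.createDataFrame(
    [(2, "Alice"), (5, "Bob")], ["age", "name"])
# Take 3 characters starting at position 1 (the first character).
df.select(df.name.substr(1, 3).alias("col")).collect()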
Output
# [Row(col='Ali'), Row(col='Bob')]

Example 2: Using columns for the input arguments.

Python
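# Assumes an active SparkSession is available as `spark`.
df = spark.createDataFrame(
    [(3, 4, "Alice"), (2, 3, "Bob")], ["sidx", "eidx", "name"])
# Despite its name, `eidx` is passed as the length argument, not an end index.
df.select(df.name.substr(df.sidx, df.eidx).alias("col")).collect()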
Output
# [Row(col='ice'), Row(col='ob')]
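
The results in both examples follow from startPos being 1-based and from the substring being cut off at the end of the value. As a rough illustration only, the pure-Python helper below mirrors that behavior for a single string. The name `substr_model` is hypothetical and not part of PySpark, and the sketch assumes a startPos of at least 1 and a non-negative length. It does not attempt to model how Spark treats zero or negative positions, nulls, or binary columns.

```python
def substr_model(value: str, start_pos: int, length: int) -> str:
    """Illustrative model of Column.substr for one string value.

    Hypothetical helper, not a PySpark API.
    Assumes start_pos >= 1 and length >= 0; other inputs are out of scope.
    """
    if start_pos < 1 or length < 0:
        raise ValueError("this sketch only covers start_pos >= 1 and length >= 0")
    begin = start_pos - 1  # convert the 1-based position to a 0-based index
    # Python slicing past the end of the string simply truncates the result.
    return value[begin:begin + length]


# Same inputs as Example 1 and Example 2 above.
print(substr_model("Alice", 1, 3))  # Ali
print(substr_model("Bob", 1, 3))    # Bob
print(substr_model("Alice", 3, 4))  # ice
print(substr_model("Bob", 2, 3))    # ob
```

Note that in Example 2 the requested length of 4 for "Alice" runs past the end of the value, so only the three remaining characters are returned.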