substring

Returns the substring that starts at position pos and has length len when str is of String type, or the slice of the byte array that starts at byte position pos and has length len (in bytes) when str is of Binary type.

The position is 1-based, not 0-based: the first character (or byte) is at position 1.
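To make the 1-based position and the String versus Binary distinction concrete, here is a minimal pure-Python sketch. The helper `substring_model` is hypothetical and is not part of PySpark or Databricks; it only models the case where pos is at least 1 and len is non-negative, and it makes no claim about how the real function treats other values.

```python
def substring_model(value, pos, length):
    """Hypothetical pure-Python model of a 1-based substring.

    Works on str (counts characters) and bytes (counts bytes).
    Only pos >= 1 and length >= 0 are modeled in this sketch.
    """
    if pos < 1 or length < 0:
        raise ValueError("this sketch only models pos >= 1 and length >= 0")
    start = pos - 1  # convert the 1-based position to a 0-based offset
    return value[start:start + length]


# Position 2 is the second character, so this selects 'par' from 'Spark'.
text_result = substring_model("Spark", 2, 3)

# For text, the length counts characters; for bytes, it counts bytes.
# 'é' is one character but two bytes in UTF-8, so the two results differ.
chars = substring_model("héllo", 1, 2)
raw = substring_model("héllo".encode("utf-8"), 1, 2)

print(text_result, chars, raw)
```

The same slice arithmetic applies to both input types; only the unit (character or byte) changes, which mirrors the String and Binary cases described above.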

For the corresponding Databricks SQL function, see substring function.

Syntax

Python
from pyspark.databricks.sql import functions as dbf

dbf.substring(str=<str>, pos=<pos>, len=<len>)

Parameters

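| Parameter | Type | Description |
| --- | --- | --- |
| str | pyspark.sql.Column or str | Target column to work on. |
| pos | pyspark.sql.Column or str or int | Starting position in str (1-based). |
| len | pyspark.sql.Column or str or int | Length of the result, in characters for String input or in bytes for Binary input. |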

Returns

pyspark.sql.Column: substring of given value.

Examples

Python
# Assumes an active SparkSession named `spark`, as in a Databricks notebook.
from pyspark.databricks.sql import functions as dbf

# Literal position and length: 2 characters starting at position 1.
df = spark.createDataFrame([('abcd',)], ['s',])
df.select('*', dbf.substring(df.s, 1, 2)).show()

# Position and/or length supplied as Column objects.
df = spark.createDataFrame([('Spark', 2, 3)], ['s', 'p', 'l'])
df.select('*', dbf.substring(df.s, 2, df.l)).show()
df.select('*', dbf.substring(df.s, df.p, 3)).show()
df.select('*', dbf.substring(df.s, df.p, df.l)).show()

# Columns referenced by name (str) instead of Column objects.
df.select('*', dbf.substring(df.s, 2, 'l')).show()
df.select('*', dbf.substring('s', 'p', 'l')).show()
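When pos and len are given as columns, each row supplies its own position and length. The following pure-Python sketch models that per-row behavior without Spark. The helper `substring_per_row` and the second data row are hypothetical and are used only for illustration; the sketch assumes every position is at least 1.

```python
# Rows modeled as plain dicts; the first mirrors the ('Spark', 2, 3) example.
rows = [
    {"s": "Spark", "p": 2, "l": 3},
    {"s": "Databricks", "p": 5, "l": 6},  # hypothetical extra row
]


def substring_per_row(rows, str_col, pos_col, len_col):
    """Hypothetical model: apply a 1-based substring row by row."""
    results = []
    for row in rows:
        start = row[pos_col] - 1  # 1-based position to 0-based offset
        results.append(row[str_col][start:start + row[len_col]])
    return results


result = substring_per_row(rows, "s", "p", "l")
print(result)
```

Each row is sliced with its own p and l values, which is the idea behind passing df.p and df.l (or the names 'p' and 'l') in the examples above.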