substring_index

Returns the substring of the string str that precedes count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. substring_index performs a case-sensitive match when searching for delim.
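
The counting rule above can be illustrated with a small plain-Python sketch. This is not the Spark implementation: the helper name `substring_index_py` is hypothetical, the handling of a zero count or an empty delimiter (returning an empty string) is an assumption, and delimiters that can overlap themselves (for example `'aa'` inside `'aaa'`) are not modeled.

```python
def substring_index_py(s: str, delim: str, count: int) -> str:
    """Plain-Python sketch of the semantics described above (hypothetical helper)."""
    # Assumption: a zero count or an empty delimiter yields an empty string.
    if count == 0 or delim == "":
        return ""
    if count > 0:
        # Count delimiters from the left and keep everything before the count-th one.
        parts = s.split(delim)
        return delim.join(parts[:count])
    # Count delimiters from the right and keep everything after the |count|-th one.
    parts = s.rsplit(delim)
    return delim.join(parts[count:])


print(substring_index_py("a.b.c.d", ".", 2))   # a.b
print(substring_index_py("a.b.c.d", ".", -3))  # b.c.d
# The match is case-sensitive: 'X' is not treated as the delimiter 'x'.
print(substring_index_py("aXbxc", "x", 1))     # aXb
```

In this sketch, when str contains fewer than count delimiters, the slice covers every part and the whole string comes back unchanged.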

For the corresponding Databricks SQL function, see substring_index function.

Syntax

Python
from pyspark.databricks.sql import functions as dbf

dbf.substring_index(str=<str>, delim=<delim>, count=<count>)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| str | pyspark.sql.Column or str | Target column to work on. |
| delim | literal string | Delimiter of values. |
| count | int | Number of occurrences. |

Returns

pyspark.sql.Column: substring of given value.

Examples

Python
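from pyspark.databricks.sql import functions as dbf

df = spark.createDataFrame([('a.b.c.d',)], ['s'])
# count=2: keep everything left of the 2nd '.' counted from the left -> 'a.b'
df.select('*', dbf.substring_index(df.s, '.', 2)).show()
# count=-3: keep everything right of the 3rd '.' counted from the right -> 'b.c.d'
df.select('*', dbf.substring_index('s', '.', -3)).show()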