regexp_instr
Returns the position of the first substring in `str` that matches the Java regex `regexp` and corresponds to the regex group index `idx`. Positions are 1-based; if no match is found, the function returns 0.
For the corresponding Databricks SQL function, see regexp_instr function.
Syntax
Python
from pyspark.databricks.sql import functions as dbf
dbf.regexp_instr(str=<str>, regexp=<regexp>, idx=<idx>)
Parameters
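| Parameter | Type | Description |
|---|---|---|
| `str` | `pyspark.sql.Column` or `str` | Target column to work on. |
| `regexp` | `pyspark.sql.Column` or `str` | Regex pattern to apply. |
| `idx` | `pyspark.sql.Column` or `int`, optional | Matched group id. |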
Examples
Python
from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame([("1a 2b 14m", r"\d+(a|b|m)")], ["str", "regexp"])
Python
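# Pattern passed as a literal; idx omitted.
df.select('*', dbf.regexp_instr('str', dbf.lit(r'\d+(a|b|m)'))).show()
# Pattern passed as a literal, with an explicit group index.
df.select('*', dbf.regexp_instr('str', dbf.lit(r'\d+(a|b|m)'), dbf.lit(1))).show()
# Pattern read from the "regexp" column.
df.select('*', dbf.regexp_instr('str', dbf.col("regexp"))).show()
# Columns can be given as Column objects or as column-name strings.
df.select('*', dbf.regexp_instr(dbf.col("str"), "regexp")).show()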
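To make the position convention concrete without a Spark session, the following is a minimal pure-Python sketch of the return value, assuming the 1-based, 0-on-no-match convention documented for the Spark SQL `regexp_instr` function. The helper name `regexp_instr_sketch` is hypothetical and not part of any library. It uses Python's `re` module, whose syntax differs from Java regex in some details, and it does not model the `idx` argument.

```python
import re


def regexp_instr_sketch(s, pattern):
    """Hypothetical helper: 1-based start of the first match, or 0 if none.

    Approximates the position convention only; it uses Python regex
    syntax (not Java) and ignores the group index argument.
    """
    match = re.search(pattern, s)
    return match.start() + 1 if match else 0


# Same string and pattern as in the examples above.
print(regexp_instr_sketch("1a 2b 14m", r"\d+(a|b|m)"))  # 1
# The first match ("2b") starts at the third character.
print(regexp_instr_sketch("x 2b 14m", r"\d+(a|b|m)"))   # 3
# No digits in the input, so there is no match.
print(regexp_instr_sketch("abc", r"\d+"))               # 0
```

A sketch like this can be handy for sanity-checking expected positions in unit tests before running the real function on a cluster, as long as the pattern behaves the same way in both regex dialects.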