regexp_extract_all
Extracts all substrings of str that match the Java regular expression regexp and, for each match, returns the capture group selected by the group index idx. The result is an array of strings.
For the corresponding Databricks SQL function, see regexp_extract_all function.
Syntax
Python
from pyspark.databricks.sql import functions as dbf
dbf.regexp_extract_all(str=<str>, regexp=<regexp>, idx=<idx>)
Parameters
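| Parameter | Type | Description |
|---|---|---|
| str | pyspark.sql.Column or str | Target column to work on. |
| regexp | pyspark.sql.Column or str | Regex pattern to apply. A plain string is read as a column name; wrap a literal pattern in dbf.lit(). |
| idx | pyspark.sql.Column or int, optional | Matched group id. Defaults to 1 when omitted; 0 selects the whole match. |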
Examples
Python
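from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame([("100-200, 300-400", r"(\d+)-(\d+)")], ["str", "regexp"])
# idx omitted: group 1 of each match is returned (100 and 300)
df.select('*', dbf.regexp_extract_all('str', dbf.lit(r'(\d+)-(\d+)'))).show()
# idx passed as a Column
df.select('*', dbf.regexp_extract_all('str', dbf.lit(r'(\d+)-(\d+)'), dbf.lit(1))).show()
# idx passed as an int: group 2 of each match is returned (200 and 400)
df.select('*', dbf.regexp_extract_all('str', dbf.lit(r'(\d+)-(\d+)'), 2)).show()
# pattern read from the "regexp" column, passed as a Column
df.select('*', dbf.regexp_extract_all('str', dbf.col("regexp"))).show()
# pattern read from the "regexp" column, passed by column name
df.select('*', dbf.regexp_extract_all(dbf.col('str'), "regexp")).show()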
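
For intuition about how idx selects a group, the following is a minimal plain-Python sketch of the same per-row logic. It is not the Spark implementation: the helper extract_all is hypothetical, and it uses Python's re module rather than Java regex, which behaves the same for this simple pattern but differs in some syntax.

```python
import re

# Hypothetical helper that mimics the per-row behavior for illustration only.
def extract_all(text, pattern, idx=1):
    """Return group `idx` of every non-overlapping match of `pattern` in `text`."""
    # idx=0 selects the whole match; idx>=1 selects that capture group.
    return [m.group(idx) for m in re.finditer(pattern, text)]

text = "100-200, 300-400"
pattern = r"(\d+)-(\d+)"

print(extract_all(text, pattern))     # ['100', '300']
print(extract_all(text, pattern, 2))  # ['200', '400']
print(extract_all(text, pattern, 0))  # ['100-200', '300-400']
```

Each match of the pattern contributes one array element, so the two matches "100-200" and "300-400" yield two-element results regardless of which group is chosen.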