regexp_replace function
Applies to: Databricks SQL
Databricks Runtime
Replaces all substrings of str that match regexp with rep.
Syntax
regexp_replace(str, regexp, rep [, position] )
Arguments
str: ASTRINGexpression to be matched.regexp: ASTRINGexpression with a matching pattern.rep: ASTRINGexpression which is the replacement string.position: A optional integral numeric literal greater than 0, stating where to start matching. The default is 1.
Returns
A STRING.
The regexp string must be a regular expression. See Regular expressions for the supported syntax.
When using literals, use raw-literal (r prefix) to avoid escape character pre-processing.
Searching starts at position. The default is 1, which marks the beginning of str.
If position exceeds the character length of str, the result is str.
Common error conditions
Examples
Replace every match with fixed text
SQL
> SELECT regexp_replace('100-200', r'(\d+)', 'num');
num-num
Mask digits
SQL
> SELECT regexp_replace('SSN 123-45-6789', r'\d', '#');
SSN ###-##-####
Reorder parts using capture groups
In the replacement string, use $ followed by the group number to refer to a capture group ($1, $2).
SQL
> SELECT regexp_replace('2024-03-15', r'(\d{4})-(\d{2})-(\d{2})', '$3/$2/$1');
15/03/2024
Collapse repeated whitespace
SQL
> SELECT regexp_replace('too many spaces', r'\s+', ' ');
too many spaces
Invalid regex pattern
SQL
> SELECT regexp_replace('abc', '[invalid', 'x');
Error: INVALID_PARAMETER_VALUE.PATTERN