levenshtein
Computes the Levenshtein distance of the two given strings.
For the corresponding Databricks SQL function, see levenshtein function.
Syntax
Python
from pyspark.databricks.sql import functions as dbf
dbf.levenshtein(left=<left>, right=<right>, threshold=<threshold>)
Parameters
| Parameter | Type | Description |
|---|---|---|
| left | pyspark.sql.Column or str | First column value. |
| right | pyspark.sql.Column or str | Second column value. |
| threshold | int, optional | If set, the function returns the distance only when the Levenshtein distance of the two strings is less than or equal to the threshold; otherwise it returns -1. |
Returns
pyspark.sql.Column: The Levenshtein distance as an integer value, or -1 if a threshold is set and the distance exceeds it.
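To make the return value and the threshold behavior concrete, the following is a minimal plain-Python sketch of the semantics described above. It is not the Spark implementation: the helper name levenshtein_reference is hypothetical, it works on Python strings rather than columns, and it assumes the threshold rule is "return the distance if it is less than or equal to the threshold, otherwise -1".

```python
from typing import Optional


def levenshtein_reference(left: str, right: str, threshold: Optional[int] = None) -> int:
    """Hypothetical reference helper, not part of PySpark or Databricks.

    Computes the Levenshtein (edit) distance with a two-row dynamic
    programming table, then applies the assumed threshold rule.
    """
    # previous[j] holds the distance between the empty prefix of `left`
    # and the first j characters of `right`.
    previous = list(range(len(right) + 1))

    for i, left_char in enumerate(left, start=1):
        # Turning the first i characters of `left` into "" takes i deletions.
        current = [i]
        for j, right_char in enumerate(right, start=1):
            substitution_cost = 0 if left_char == right_char else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # substitution or match
                )
            )
        previous = current

    distance = previous[-1]

    # Assumed threshold rule: distances above the threshold collapse to -1.
    if threshold is not None and distance > threshold:
        return -1
    return distance


print(levenshtein_reference("kitten", "sitting"))     # 3
print(levenshtein_reference("kitten", "sitting", 2))  # -1
```

Under these assumptions, 'kitten' and 'sitting' are three edits apart (two substitutions and one insertion), so a threshold of 2 yields -1 while a threshold of 3 or more yields 3.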
Examples
Python
from pyspark.databricks.sql import functions as dbf
df = spark.createDataFrame([('kitten', 'sitting',)], ['l', 'r'])
df.select('*', dbf.levenshtein('l', 'r')).show()
Python
df.select('*', dbf.levenshtein(df.l, df.r, 2)).show()