make_valid_utf8 function

Applies to: Databricks Runtime 15.4 and later

Returns a string in which all invalid UTF-8 byte sequences in strExpr are replaced by the Unicode replacement character (U+FFFD).

Syntax

make_valid_utf8(strExpr)

Arguments

  • strExpr: A STRING expression.

Returns

A STRING consisting of a valid UTF-8 byte sequence.

Examples

 Simple example taking a valid string as input.
> SELECT make_valid_utf8('Spark')
  Spark

 Simple example taking a valid collated string as input.
> SELECT make_valid_utf8('SQL' COLLATE UTF8_LCASE)
  SQL

 Simple example taking a valid hexadecimal string as input.
> SELECT make_valid_utf8(x'61')
  a

 Example taking an invalid hexadecimal string as input (illegal UTF-8 byte sequence).
> SELECT make_valid_utf8(x'80')
  

- Example taking an invalid hexadecimal string as input (illegal UTF-8 byte sequence).
> SELECT make_valid_utf8(x'61C262')
  ab
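
The replacement behavior described above can be approximated outside Databricks for illustration. The following is a minimal Python sketch, not the Databricks implementation: the helper name make_valid_utf8_sketch is hypothetical, and it relies on Python's built-in UTF-8 decoder with errors="replace". It is assumed, not confirmed, that this matches make_valid_utf8 for every input; the number of U+FFFD characters emitted for longer invalid runs may differ.

```python
# Illustrative sketch only; make_valid_utf8_sketch is a hypothetical helper,
# not part of Databricks or Spark.

def make_valid_utf8_sketch(data: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return data.decode("utf-8", errors="replace")

# The same byte inputs used in the SQL examples above.
samples = [b"Spark", bytes.fromhex("61"), bytes.fromhex("80"), bytes.fromhex("61C262")]

for raw in samples:
    fixed = make_valid_utf8_sketch(raw)
    # Print code points so the replacement character is visible in any terminal.
    print(raw.hex(), [f"U+{ord(ch):04X}" for ch in fixed])
```

Printing code points rather than the raw string makes U+FFFD visible even where the terminal or font cannot render it.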