COLLATION

Applies to: Databricks SQL, Databricks Runtime 18.3 and above

The COLLATION configuration parameter sets the default collation for the session.

The default collation applies to every string operation that has neither an explicit nor an implicit collation. See Default collation for the complete derivation rules and Collation precedence for how explicit and implicit collations override the default.

You can set this parameter at the session level using the SET COLLATION statement.

Setting

The parameter must be set to a collation_name.

Common collations are:

  • UTF8_BINARY
  • UTF8_LCASE
  • UNICODE
  • UNICODE_CI

For a complete list of supported collations, see Supported collations.

System default

The system default is UTF8_BINARY.

Examples

SQL
> SET COLLATION UNICODE_CI;

-- Use the default (session) collation because c1 has no collation set.
-- Under UNICODE_CI both values compare equal, so their relative order is not guaranteed.
> SELECT * FROM VALUES('hello'), ('Hello') AS T(c1) ORDER BY c1;
hello
Hello

-- The default collation also applies to string literals.
> SELECT 'a' = 'A';
true

-- Reset the default collation back to the system default.
> SET COLLATION UTF8_BINARY;
> SELECT 'a' = 'A';
false
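
-- A minimal sketch, assuming the precedence rules referenced above
-- (an explicit collation overrides the session default):
-- with a case-insensitive default in effect, an explicit COLLATE clause
-- on one operand is expected to make this comparison case-sensitive.
> SET COLLATION UNICODE_CI;
> SELECT 'a' = 'A' COLLATE UTF8_BINARY;

-- Restore the system default afterwards.
> SET COLLATION UTF8_BINARY;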