sentences function (Databricks SQL)

Splits str into an array of sentences, where each sentence is an array of words.

Syntax

sentences(str [, lang, country] )

Arguments

  • str: A STRING expression to be parsed.

  • lang: An optional STRING expression with a language code from ISO 639 Alpha-2 (e.g. 'DE'), Alpha-3, or a language subtag of up to 8 characters.

  • country: An optional STRING expression with an ISO 3166 alpha-2 country code or a UN M.49 numeric-3 area code.

Returns

An ARRAY of ARRAY of STRING.

The default for lang is 'en' and the default for country is 'US'.

Examples

> SELECT sentences('Hi there! Good morning.');
 [[Hi, there],[Good, morning]]
> SELECT sentences('Hi there! Good morning.', 'en', 'US');
 [[Hi, there],[Good, morning]]
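
Because the result is an ARRAY of ARRAY of STRING, it can be passed to other array operations. The following is a minimal sketch, assuming the standard size function and zero-based bracket indexing on arrays; it counts the sentences and extracts the words of the first one.

```sql
-- Number of sentences found in the input
> SELECT size(sentences('Hi there! Good morning.'));
 2
-- Words of the first sentence (array indexes start at 0)
> SELECT sentences('Hi there! Good morning.')[0];
 [Hi, there]
```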