search function
Applies to: Databricks Runtime 19.0 and above
This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Databricks previews.
Returns true if search_pattern is found in any of the supplied target column values.
Syntax
search ( column [, ...], search_pattern [, mode => mode] )
Arguments
-
column: One or more searchable target expressions. A searchable expression has one of these types:STRINGwithUTF8_BINARYcollation.VARIANT.STRUCTwith at least one searchable field. The struct is automatically expanded to its searchable leaf fields at any nesting depth.ARRAYof any searchable type.
You can pass
*as the column list to expand to all searchable columns in the input. There is no implicit cast for non-string target values. -
search_pattern: A foldable (constant)STRINGexpression with the value to search for. Formode => 'ip', the value must be a valid IPv4 address or IPv4 CIDR block in CIDR notation. -
mode: An optional namedSTRINGargument that controls how matches are performed. One of:'substring'(default): Matches ifsearch_patternappears anywhere within a target value. Equivalent tocontainsfunction.'word': Matches the individual words insearch_patternagainst a target value, regardless of order.'ip': Whensearch_patternis a single IPv4 address (for example,'192.168.1.0'), matches if the address appears anywhere within a target value. Whensearch_patternis an IPv4 CIDR block (for example,'192.168.1.0/24'), matches if the target value contains an IPv4 address that belongs to the block.
Returns
A BOOLEAN.
trueifsearch_patternis found in any of the target column values.NULLifsearch_patternis not found in any target column value and at least one of those values isNULL.falseotherwise.
Notes
isearchfunction is the case-insensitive variant ofsearchwith otherwise identical behavior.- For a
STRUCTargument, the function searches every searchable leaf field reachable from the top-level struct, regardless of nesting depth. The same expansion applies recursively toSTRUCTfields withinVARIANTandARRAYvalues. - The
*wildcard expands to all searchable columns in the input. If no input column is searchable, the wildcard expansion fails with SEARCH_REQUIRES_SEARCHABLE_COLUMNS.STAR_EXPANDED_TO_NONE. - In
'substring'mode,VARIANTkeys and non-string scalar values inside aVARIANTare not matched. Use'word'mode or extract individual fields to search those.
Common error conditions
- SEARCH_REQUIRES_SEARCHABLE_COLUMNS
- SEARCH_REQUIRES_STRING_LITERALS_ARGUMENTS
- SEARCH_INVALID_IP_PATTERN
Examples
-- Basic examples.
> SELECT search(column, 'needle', mode => 'substring') FROM VALUES ('Needle') AS table(column);
false
> SELECT search(column, 'quick fox', mode => 'substring') FROM VALUES ('quick brown fox') AS table(column);
false
> SELECT search(column, lower('NEEDLE')) FROM VALUES ('needle') AS table(column);
true
> SELECT search(column, 'quick fox', mode => 'word') FROM VALUES ('quick brown fox') AS table(column);
true
> SELECT search(column, 'qui fox', mode => 'word') FROM VALUES ('quick fox') AS table(column);
false
> SELECT search(column, '10.0.1.1', mode => 'ip') FROM VALUES ('10.0.0.1') AS table(column);
false
> SELECT search(column, '10.0.0.0/24', mode => 'ip') FROM VALUES ('10.0.0.1') AS table(column);
true
-- Search across multiple columns with mixed types using the `*` wildcard.
> CREATE TABLE test_table (
int_column INT, -- not searchable
array_column ARRAY<STRING>,
variant_column VARIANT,
string_column STRING)
USING DELTA;
> SELECT * FROM test_table WHERE search(int_column, 'needle');
[DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE]
> SELECT * FROM test_table WHERE search(*, 'needle');
-- Equivalent to:
> SELECT * FROM test_table
WHERE search(array_column, variant_column, string_column, 'needle');
-- Automatic expansion of STRUCT columns to their searchable leaf fields.
> CREATE TABLE test_table (
usage_stats STRUCT<digits: INT>,
customer_info STRUCT<
contact: STRUCT<json_field: VARIANT>,
name: STRING>)
USING DELTA;
> SELECT * FROM test_table WHERE search(usage_stats.digits, 'needle');
[DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE]
> SELECT * FROM test_table WHERE search(customer_info, 'needle');
-- Equivalent to:
> SELECT * FROM test_table
WHERE search(customer_info.contact.json_field, customer_info.name, 'needle');
> SELECT * FROM test_table WHERE search(customer_info.contact, 'needle');
-- Equivalent to:
> SELECT * FROM test_table
WHERE search(customer_info.contact.json_field, 'needle');
-- VARIANT behavior: substring mode does not match keys or non-string scalar values.
> CREATE TABLE test_table AS
SELECT parse_json('{
"role": "user",
"id": 101,
"preferences": {"theme": "dark", "language": "en"}
}') AS column;
> SELECT search(column, 'user', mode => 'substring') FROM test_table;
true
> SELECT search(column, 'preferences', mode => 'substring') FROM test_table;
false
> SELECT search(column, '101', mode => 'substring') FROM test_table;
false
-- NULL behavior.
> SELECT search(NULL, 'needle', mode => 'substring');
NULL
> SELECT search(CAST(NULL AS STRING), 'needle', mode => 'substring');
NULL
> SELECT search('needle in haystack', NULL, mode => 'substring');
[SEARCH_REQUIRES_STRING_LITERALS_ARGUMENTS]