Skip to main content

search function

Applies to: check marked yes Databricks Runtime 19.0 and above

Beta

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Databricks previews.

Returns true if search_pattern is found in any of the supplied target column values.

Syntax

search ( column [, ...], search_pattern [, mode => mode] )

Arguments

  • column: One or more searchable target expressions. A searchable expression has one of these types:

    • STRING with UTF8_BINARY collation.
    • VARIANT.
    • STRUCT with at least one searchable field. The struct is automatically expanded to its searchable leaf fields at any nesting depth.
    • ARRAY of any searchable type.

    You can pass * as the column list to expand to all searchable columns in the input. There is no implicit cast for non-string target values.

  • search_pattern: A foldable (constant) STRING expression with the value to search for. For mode => 'ip', the value must be a valid IPv4 address or IPv4 CIDR block in CIDR notation.

  • mode: An optional named STRING argument that controls how matches are performed. One of:

    • 'substring' (default): Matches if search_pattern appears anywhere within a target value. Equivalent to contains function.
    • 'word': Matches the individual words in search_pattern against a target value, regardless of order.
    • 'ip': When search_pattern is a single IPv4 address (for example, '192.168.1.0'), matches if the address appears anywhere within a target value. When search_pattern is an IPv4 CIDR block (for example, '192.168.1.0/24'), matches if the target value contains an IPv4 address that belongs to the block.

Returns

A BOOLEAN.

  • true if search_pattern is found in any of the target column values.
  • NULL if search_pattern is not found in any target column value and at least one of those values is NULL.
  • false otherwise.

Notes

  • isearch function is the case-insensitive variant of search with otherwise identical behavior.
  • For a STRUCT argument, the function searches every searchable leaf field reachable from the top-level struct, regardless of nesting depth. The same expansion applies recursively to STRUCT fields within VARIANT and ARRAY values.
  • The * wildcard expands to all searchable columns in the input. If no input column is searchable, the wildcard expansion fails with SEARCH_REQUIRES_SEARCHABLE_COLUMNS.STAR_EXPANDED_TO_NONE.
  • In 'substring' mode, VARIANT keys and non-string scalar values inside a VARIANT are not matched. Use 'word' mode or extract individual fields to search those.

Common error conditions

Examples

SQL
-- Basic examples.
> SELECT search(column, 'needle', mode => 'substring') FROM VALUES ('Needle') AS table(column);
false

> SELECT search(column, 'quick fox', mode => 'substring') FROM VALUES ('quick brown fox') AS table(column);
false

> SELECT search(column, lower('NEEDLE')) FROM VALUES ('needle') AS table(column);
true

> SELECT search(column, 'quick fox', mode => 'word') FROM VALUES ('quick brown fox') AS table(column);
true

> SELECT search(column, 'qui fox', mode => 'word') FROM VALUES ('quick fox') AS table(column);
false

> SELECT search(column, '10.0.1.1', mode => 'ip') FROM VALUES ('10.0.0.1') AS table(column);
false

> SELECT search(column, '10.0.0.0/24', mode => 'ip') FROM VALUES ('10.0.0.1') AS table(column);
true

-- Search across multiple columns with mixed types using the `*` wildcard.
> CREATE TABLE test_table (
int_column INT, -- not searchable
array_column ARRAY<STRING>,
variant_column VARIANT,
string_column STRING)
USING DELTA;

> SELECT * FROM test_table WHERE search(int_column, 'needle');
[DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE]

> SELECT * FROM test_table WHERE search(*, 'needle');
-- Equivalent to:
> SELECT * FROM test_table
WHERE search(array_column, variant_column, string_column, 'needle');

-- Automatic expansion of STRUCT columns to their searchable leaf fields.
> CREATE TABLE test_table (
usage_stats STRUCT<digits: INT>,
customer_info STRUCT<
contact: STRUCT<json_field: VARIANT>,
name: STRING>)
USING DELTA;

> SELECT * FROM test_table WHERE search(usage_stats.digits, 'needle');
[DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE]

> SELECT * FROM test_table WHERE search(customer_info, 'needle');
-- Equivalent to:
> SELECT * FROM test_table
WHERE search(customer_info.contact.json_field, customer_info.name, 'needle');

> SELECT * FROM test_table WHERE search(customer_info.contact, 'needle');
-- Equivalent to:
> SELECT * FROM test_table
WHERE search(customer_info.contact.json_field, 'needle');

-- VARIANT behavior: substring mode does not match keys or non-string scalar values.
> CREATE TABLE test_table AS
SELECT parse_json('{
"role": "user",
"id": 101,
"preferences": {"theme": "dark", "language": "en"}
}') AS column;

> SELECT search(column, 'user', mode => 'substring') FROM test_table;
true

> SELECT search(column, 'preferences', mode => 'substring') FROM test_table;
false

> SELECT search(column, '101', mode => 'substring') FROM test_table;
false

-- NULL behavior.
> SELECT search(NULL, 'needle', mode => 'substring');
NULL

> SELECT search(CAST(NULL AS STRING), 'needle', mode => 'substring');
NULL

> SELECT search('needle in haystack', NULL, mode => 'substring');
[SEARCH_REQUIRES_STRING_LITERALS_ARGUMENTS]