Vector Search filtering guide
This page describes the filter expression syntax for Vector Search queries. The filter syntax is different for standard and storage-optimized endpoints.
- Standard endpoints use a Python dictionary, where the key encodes both the column name and the operator (for example,
{"price <": 200}). - Storage-optimized endpoints use a SQL-like filter string similar to a
WHEREclause (for example,"price < 200").
For a complete reference of supported operators and where to find filters in the query API, see Use filters on queries. For example notebooks, see Example notebooks.
Standard endpoints
Standard endpoints accept a Python dictionary passed to similarity_search() through the filters parameter. The key encodes both the column name and the operator (for example, {"price >": 40000}), and multiple keys in the same dict are combined with AND logic. This section summarizes the supported operators and known limitations.
Quick reference
String columns
Operator | Syntax | Example |
|---|---|---|
Exact match |
|
|
Negation |
|
|
OR (multiple values) |
|
|
Token-based |
|
|
Hyphenated values |
|
|
JSON pattern |
|
|
Numeric columns (INT, DOUBLE)
Operator | Syntax | Example |
|---|---|---|
Greater than |
|
|
Less than or equal |
|
|
Greater than or equal |
|
|
Integer match |
|
|
Range | Two keys in dict |
|
Boolean columns
Operator | Syntax | Example |
|---|---|---|
Match true |
|
|
Match false |
|
|
Array columns
Standard endpoints support primitive ARRAY types: ARRAY<STRING>, ARRAY<INT>, ARRAY<BIGINT>, ARRAY<SMALLINT>, ARRAY<TINYINT>, ARRAY<FLOAT>, ARRAY<DOUBLE>, ARRAY<BOOLEAN>, ARRAY<DATE>, and ARRAY<TIMESTAMP>. ARRAY<STRUCT> is not supported.
Operator | Syntax | Example |
|---|---|---|
Contains a value |
|
|
Contains any value (OR) |
|
|
AND with another column |
|
|
Timestamp columns (native TIMESTAMP or DATE)
Operator | Syntax | Example |
|---|---|---|
After |
|
|
Exact timestamp match |
|
|
Negation |
|
|
Range | Two keys in dict |
|
Combined filters
Multiple keys in a single dict are combined with AND logic. Use the {"col1 OR col2 op": [v1, v2]} syntax for OR across different fields.
Pattern | Syntax | Example |
|---|---|---|
String + numeric | Multiple keys |
|
Numeric + numeric | Multiple keys |
|
String + numeric + array | Multiple keys |
|
String + array + numeric | Multiple keys |
|
String + numeric + timestamp | Multiple keys |
|
| Combined key |
|
JSON | List of dicts |
|
| List of dicts |
|
Limitations
Limitation | Detail | Workaround |
|---|---|---|
| Creating an index with | Unflatten the struct into a string column for indexing and filtering. |
| Matches whole whitespace-separated tokens, not SQL wildcard patterns ( | Use a storage-optimized endpoint for wildcard |
| Using | Use a native |
| There is no | Use two keys in the dict, for example |
No SQL functions in filters | Functions like | Use a storage-optimized endpoint with a SQL filter string. For timestamps, store as epoch milliseconds. |
JSON numeric filtering not supported | Standard endpoints cannot extract or cast nested JSON values (for example, | Use |
Duplicate dict keys silently dropped | Python keeps only the last value for duplicate keys in a dict. For example, | Use a list of dicts: |
Storage-optimized endpoints
Storage-optimized endpoints accept a SQL-like filter string passed to similarity_search() through the filters parameter. This section summarizes the supported operators and known limitations.
Quick reference
String columns
Operator | Syntax | Example |
|---|---|---|
Exact match |
|
|
Negation |
|
|
OR (multiple values) |
|
|
Wildcard pattern |
|
|
Hyphenated values |
|
|
JSON substring match |
|
|
Numeric columns (INT, DOUBLE)
Operator | Syntax | Example |
|---|---|---|
Greater than |
|
|
Less than or equal |
|
|
Greater than or equal |
|
|
Integer match |
|
|
Range |
|
|
Boolean columns
Operator | Syntax | Example |
|---|---|---|
Boolean true |
|
|
Array columns
Array filtering is not supported on storage-optimized endpoints. ARRAY_CONTAINS raises a BadRequest: Syntax error. As a workaround, concatenate array values into a string column and use LIKE. For example:
"body_type LIKE '%sedan%'""body_type LIKE '%hybrid%' OR body_type LIKE '%electric%'"
Timestamp columns (native TIMESTAMP)
Operator | Syntax | Example |
|---|---|---|
After date |
|
|
Range | Combine with |
|
Combined filters
Pattern | Syntax | Example |
|---|---|---|
String + numeric |
|
|
Numeric + numeric |
|
|
String + numeric + |
|
|
|
|
|
Timestamp + numeric + string |
|
|
Limitations
Limitation | Detail | Workaround |
|---|---|---|
| Creating an index with | Unflatten the struct into a string column for indexing and filtering. |
|
| Concatenate array values into a string column and use |
| Using | Use |
Bare timestamp strings cause type mismatch |
| Wrap the value with |
JSON numeric filtering not supported | SQL functions like | Pre-extract JSON fields into top-level columns at index creation time, or use |
Post-filtering (overfetch) | Results are ranked by relevance first, then filtered. Most cases are handled automatically, but in rare scenarios—mainly | Increase |