ai_mask function
Applies to: Databricks SQL Databricks Runtime
Preview
This feature is in Public Preview.
In the preview:
The underlying language model can handle several languages; however, these functions are tuned for English.
The underlying Foundation Model APIs are rate limited. See Foundation Model APIs limits to update these limits.
The ai_mask() function lets you invoke a state-of-the-art generative AI model from SQL to mask specified entities in a given text. This function uses a chat model serving endpoint made available by Databricks Foundation Model APIs.
Requirements
Important
The underlying models that might be used at this time are licensed under the Apache 2.0 license or Llama 2 community license. Databricks recommends reviewing these licenses to ensure compliance with any applicable terms. If models emerge in the future that perform better according to Databricks’s internal benchmarks, Databricks may change the model (and the list of applicable licenses provided on this page).
Currently, Mixtral-8x7B Instruct is the underlying model that powers these AI functions.
This function is only available on workspaces in regions that support AI Functions using Foundation Model APIs.
This function is not available on Databricks SQL Classic.
Check the Databricks SQL pricing page.
Note
In Databricks Runtime 15.1 and above, this function is supported in Databricks notebooks, including notebooks that are run as a task in a Databricks workflow.
Arguments
content: A STRING expression.
labels: An ARRAY<STRING> literal. Each element represents a type of information to be masked.
Examples
> SELECT ai_mask(
'John Doe lives in New York. His email is john.doe@example.com.',
array('person', 'email')
);
"[MASKED] lives in New York. His email is [MASKED]."
> SELECT ai_mask(
'Contact me at 555-1234 or visit us at 123 Main St.',
array('phone', 'address')
);
"Contact me at [MASKED] or visit us at [MASKED]"
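The examples above pass a string literal as content. Because content is a STRING expression, the same call can presumably be applied to a column, one row at a time. The following is a minimal sketch under that assumption: the inline relation notes and its columns id and note are hypothetical stand-ins for a real table, and the label names are illustrative. No expected output is shown because the result depends on the underlying model.

```sql
-- Sketch only: `notes`, `id`, and `note` are hypothetical names used for illustration.
-- The labels argument is kept as an ARRAY<STRING> literal, as the Arguments section requires.
SELECT
  id,
  ai_mask(note, array('person', 'email', 'phone')) AS masked_note
FROM VALUES
  (1, 'Jane Roe called from 555-0100 about her invoice.'),
  (2, 'Send the report to sam@example.com by Friday.')
  AS notes(id, note);
```

Each row results in a call to the underlying model, so when running this pattern over a large table, keep the Foundation Model APIs rate limits mentioned above in mind.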