Databricks AI features trust and safety
Databricks understands the importance of your data and the trust you place in us when you use our platform and Databricks AI features. Databricks is committed to the highest standards of data protection and has implemented rigorous measures to ensure that the information you submit to Databricks AI features is protected.
- Your data remains confidential.
  - Databricks does not train generative foundation models with data you submit to these features, and Databricks does not use this data to generate suggestions displayed to other customers.
  - Our model partners do not retain data you submit through these features, even for abuse monitoring. Partner-powered AI assistive features use zero data retention endpoints from our model partners.
- Protection from harmful output. Databricks uses Azure OpenAI content filtering to protect users from harmful content. In addition, Databricks has performed an extensive evaluation with thousands of simulated user interactions to verify that its protections against harmful content, jailbreaks, insecure code generation, and use of third-party copyrighted content are effective.
- Databricks uses only the data necessary to provide the service. Data is sent only when you interact with Databricks AI-powered features. Databricks sends your prompt, relevant table metadata and values, errors, and input code or queries to help return more relevant results. Databricks does not send other row-level data to third-party models.
- Data is protected in transit. All traffic between Databricks and model partners is encrypted in transit with industry standard TLS encryption.
- Databricks offers data residency controls. Databricks AI-powered features are Designated Services and comply with data residency boundaries. For more details, see Databricks Geos: Data residency and Databricks Designated Services.
To learn about Databricks Assistant privacy, see the Databricks Assistant privacy and security FAQ.
Features governed by the Partner-powered AI assistive features setting
Partner-powered AI refers to features powered by the Azure OpenAI service. Many, but not all, Databricks AI features use Azure OpenAI. Refer to the following table to see which features are governed by the setting:
| Feature | Where is the model hosted? | Controlled by the Partner-powered AI setting? |
|---|---|---|
| Databricks Assistant chat | Azure OpenAI service | Yes |
| Quick fix | Azure OpenAI service | Yes |
| AI-generated UC comments (Compliance security profile (CSP) workspaces) | Azure OpenAI service | Yes, for all CSP workspaces |
| AI-generated UC comments (non-CSP workspaces) | Databricks-hosted model | No, for non-CSP workspaces |
| AI/BI dashboard AI-assisted visualizations and companion Genie spaces | Azure OpenAI service | Yes |
| Genie | Azure OpenAI service | Yes |
| Databricks Inline Assistant | Azure OpenAI service | Yes |
| Databricks Assistant Autocomplete | Databricks-hosted model | No |
| Intelligent search | Azure OpenAI service | Yes |
Use a Databricks-hosted model
This feature is in Public Preview.
This section explains how a Databricks-hosted model can power Databricks AI-powered features that would otherwise be powered by Azure OpenAI.
How it works
The following steps describe how a Databricks-hosted model powers Databricks AI-powered features such as Quick Fix; a simplified code sketch of the round trip follows the steps.
- A user executes a notebook cell, which results in an error.
- Databricks attaches metadata to the request and sends it to a Databricks-hosted large language model (LLM). All data is encrypted at rest, and customers can use a customer-managed key (CMK).
- The Databricks-hosted model responds with suggested code edits to fix the error, which are displayed to the user.
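The round trip can be pictured with a short sketch. This is a minimal illustration only: `collect_metadata`, `call_databricks_hosted_llm`, and the payload fields are hypothetical stand-ins, not actual Databricks APIs.

```python
from dataclasses import dataclass


@dataclass
class FixSuggestion:
    suggested_code: str
    explanation: str


def collect_metadata(cell_code: str, error: Exception) -> dict:
    """Hypothetical helper: gather only the context needed to fix the error."""
    return {
        "code": cell_code,
        "error_type": type(error).__name__,
        "error_message": str(error),
        # Table metadata attached here would already be filtered by the
        # user's Unity Catalog permissions (see the FAQ below).
    }


def call_databricks_hosted_llm(request: dict) -> dict:
    """Hypothetical stand-in for the managed, stateless serving endpoint.

    In the real flow the request travels over TLS, and data at rest is
    encrypted (optionally with a customer-managed key).
    """
    return {"suggested_code": request["code"], "explanation": "..."}


def quick_fix(cell_code: str, error: Exception) -> FixSuggestion:
    """Sketch of the Quick Fix round trip described in the steps above."""
    request = collect_metadata(cell_code, error)
    response = call_databricks_hosted_llm(request)
    return FixSuggestion(
        suggested_code=response["suggested_code"],
        explanation=response["explanation"],
    )
```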
This feature is in public preview and is subject to change. Reach out to your Databricks representative to ask which Databricks AI features can be supported by a Databricks-hosted model.
Databricks-hosted models
When Databricks AI features use Databricks-hosted models, they use Meta Llama 3 or other models that are also available for commercial use. See information about licensing and use of generative AI models.
FAQ about Databricks-hosted models for Assistant
Can I have my own private model serving instance?
Not at this time. This preview uses model serving endpoints that are managed and secured by Databricks. The endpoints are stateless, protected through multiple layers of isolation, and implement the following security controls to protect your data:
- Every customer request to Model Serving is logically isolated, authenticated, and authorized.
- Mosaic AI Model Serving encrypts all data at rest (AES-256) and in transit (TLS 1.2+); a minimal client-side check of the negotiated TLS version is sketched below.
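As a sanity check on the transport claim, the following sketch verifies that a connection to a workspace host negotiates TLS 1.2 or newer. The hostname is a placeholder assumption; substitute your own workspace URL.

```python
import socket
import ssl

# Placeholder hostname; substitute your own workspace URL.
HOST = "example.cloud.databricks.com"

context = ssl.create_default_context()
# Refuse anything older than TLS 1.2, mirroring the documented minimum.
context.minimum_version = ssl.TLSVersion.TLSv1_2

with socket.create_connection((HOST, 443), timeout=10) as sock:
    with context.wrap_socket(sock, server_hostname=HOST) as tls:
        print(f"{HOST}: negotiated {tls.version()}")  # e.g. TLSv1.3
```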
Does the metadata sent to the models respect the user's Unity Catalog permissions?
Yes, all of the data sent to the model respects the user's Unity Catalog permissions. For example, it does not send metadata relating to tables that the user does not have permission to see.
Where is the data stored?
The Databricks Assistant chat history is stored in the control plane database, along with the notebook. The control plane database is encrypted with AES-256, and customers who require control over the encryption key can use our customer-managed key (CMK) feature.
- Like other workspace objects, the retention period for Databricks Assistant chat history is scoped to the lifecycle of the object itself. If a user deletes the notebook, the notebook and any associated chat history are deleted within 30 days.
- If the notebook is exported, the chat history is not exported with it.
- Notebook and query chat histories aren't available to other users or admins, even if the query or notebook is shared.
Can I bring my own API key for my model or host my own models?
Not at this time. The Databricks Assistant is fully managed and hosted by Databricks. Assistant functionality is heavily dependent on model serving features (for example, function calling), performance, and quality. Databricks continuously evaluates new models for the best performance and may update the model in future versions of this feature.
Who owns the output data? If Assistant generates code, who owns that IP?
The customer owns their own output.
Opt out of using Databricks-hosted models
To opt out of using Databricks-hosted models:
- Click your username in the top bar of the Databricks workspace.
- From the menu, select Previews.
- Turn off Use Assistant with Databricks-hosted models.
To learn more about managing previews, see Manage Databricks Previews.
Databricks Assistant privacy and security FAQ
What data is sent to the models?
Databricks Assistant sends your prompt (for example, your question or code) as well as relevant metadata to the model powering the feature on each API request. This helps return more relevant results for your data. Examples include the following; an illustrative request shape is sketched after the list:
- Code and queries in the current notebook cell or SQL editor tab
- Table and column names and descriptions
- Previous questions
- Favorite tables
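To make the list concrete, here is one possible shape for such a request. Every field name here is an illustrative assumption, not the actual wire format.

```python
# Illustrative shape of an Assistant request; the field names are
# assumptions for this sketch, not the actual wire format.
example_request = {
    "prompt": "Why does this query fail?",
    "editor_context": "SELECT * FROM sales.orders WHERE order_date > :start",
    "table_metadata": [
        {
            "name": "sales.orders",
            "columns": ["order_id", "order_date", "amount"],
            "description": "One row per customer order.",
        }
    ],
    "previous_questions": ["Which tables hold order data?"],
    "favorite_tables": ["sales.orders"],
    # Note: no row-level data or query results are included.
}
```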
Does the metadata sent to the models respect the user's Unity Catalog permissions?
Yes, all of the data sent to the model respects the user's Unity Catalog permissions, so it does not send metadata relating to tables that the user does not have permission to see.
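Conceptually, the filtering works like the sketch below. The `user_can_select` helper is a hypothetical stand-in; in the real system, Unity Catalog enforces permissions server-side before any metadata is attached to a request.

```python
def user_can_select(user: str, table: str) -> bool:
    """Hypothetical permission check standing in for Unity Catalog grants."""
    granted = {"alice@example.com": {"sales.orders"}}
    return table in granted.get(user, set())


def visible_table_metadata(user: str, candidate_tables: list[str]) -> list[str]:
    """Only tables the user can read are eligible to be sent as context."""
    return [t for t in candidate_tables if user_can_select(user, t)]


print(visible_table_metadata("alice@example.com", ["sales.orders", "hr.salaries"]))
# ['sales.orders'] -- hr.salaries is never included in the request
```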
If I execute a query with results, and then ask a question, do the results of my query get sent to the model?
No, query results are not used by Assistant.
If I share my notebook or query with another internal user, can they see my chat history?
No. Interactions with Assistant are visible only to the user who initiated them.
Does Databricks Assistant execute dangerous code?
No. Databricks Assistant does not automatically run code on your behalf. AI models can make mistakes, misunderstand intent, and hallucinate or give incorrect answers. Review and test AI-generated code before you run it.
Has Databricks done any assessment to evaluate the accuracy and appropriateness of the Assistant responses?
Yes. Databricks has done extensive testing of all of its AI-powered features, based on their expected use cases and using simulated user inputs, to improve the accuracy and appropriateness of responses. That said, generative AI is an emerging technology, and Assistant may provide inaccurate or inappropriate responses. Databricks has also put mitigations in place to prevent Assistant from generating harmful responses such as hate speech and insecure code, and to resist prompt jailbreaks.