Unity AI Gateway
This page covers the new AI Gateway (visible in the left navigation of the UI), which is currently in Beta. Account admins can enable access to this feature on the Previews page of the account console. See Manage Databricks previews.
For details on the previous version of AI Gateway (not Unity AI Gateway), see AI Gateway for serving endpoints (legacy).
Unity AI Gateway is the Databricks governance solution for enterprise AI. Built on Unity Catalog, it extends governance beyond your data and AI assets to the runtime interactions between models, agents, MCP servers, and tools. Use it to:
- Control which AI services teams can use: Register Databricks-hosted and external models, MCP services, and agents in Unity Catalog, then grant access with standard Unity Catalog privileges. Databricks provides foundation model services out of the box, plus managed MCP services for apps like Google Drive, Jira, Slack, and GitHub.
- Route and manage AI traffic centrally: Route requests, set rate limits, configure fallbacks, and manage budgets across models and MCP services. Hard spend caps stop requests when a budget is reached, rather than alerting after the fact.
- Set guardrails and access policies: Attach service policies to allow, deny, require approval for, or transform individual requests and responses. Built-in policies protect against PII exposure, prompt injection, and unsafe content.
- Monitor usage, cost, and risk: Track who uses which services, how much they spend, and what happened during each request, with unified agent tracing across model and MCP activity.
New to AI governance on Databricks? See Get started with AI governance for an end-to-end setup path.
Control which AI services teams can use
Unity Catalog manages AI assets as securable objects. Register them once, then grant and revoke access using the same privileges you use for tables and volumes:
- Models: Registered ML models in Unity Catalog, including Databricks-hosted foundation models available through Foundation Model APIs. See Manage model lifecycle and foundation model Unity Catalog permissions.
- MCP tools: MCP servers registered as Unity Catalog securable objects, with tool filtering and service policies. See Connect agents to third-party tools with MCP Services.
- Agents: AI agents registered as Unity Catalog securable objects and governed alongside your tables, models, and functions.
- Connections: Unity Catalog HTTP connections used to access external APIs and MCP servers. See HTTP connections.
- Functions: Unity Catalog functions used as agent tools or for data transformations. See Create AI agent tools using Unity Catalog functions.
To define and share model services as Unity Catalog securable objects across workspaces, see Create custom model services.
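As a minimal sketch of the "register once, then grant" pattern described above, the helper below builds Unity Catalog `GRANT` and `REVOKE` statements as strings. The function name `main.agent_tools.lookup_order` and the group `support-agents` are hypothetical, and the example covers only a Unity Catalog function. The securable keywords for model services, MCP services, and agents are not given on this page, so check the linked pages before adapting it.

```python
def grant_statement(privilege: str, securable_type: str, full_name: str, principal: str) -> str:
    """Build a Unity Catalog GRANT statement as a string."""
    # Backticks quote principal names that contain hyphens or spaces.
    return f"GRANT {privilege} ON {securable_type} {full_name} TO `{principal}`"


def revoke_statement(privilege: str, securable_type: str, full_name: str, principal: str) -> str:
    """Build the matching REVOKE statement."""
    return f"REVOKE {privilege} ON {securable_type} {full_name} FROM `{principal}`"


# Hypothetical function and group names, used only for illustration.
grant = grant_statement("EXECUTE", "FUNCTION", "main.agent_tools.lookup_order", "support-agents")
revoke = revoke_statement("EXECUTE", "FUNCTION", "main.agent_tools.lookup_order", "support-agents")
print(grant)
print(revoke)
```

In a Databricks notebook, a statement built this way could be run with `spark.sql(grant)`, assuming the caller has permission to manage grants on the object.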
Route and manage AI traffic
Unity AI Gateway routes requests to your model services and MCP services from a central control plane. Manage capacity, availability, and spend across Databricks-hosted and external providers:
- Enforce consumption limits on model services and MCP services to manage capacity and cost.
- Distribute requests across multiple model backends and add failover to increase availability.
- Monitor spend and set per-user thresholds and hard caps across Databricks-hosted and external providers.
Unity AI Gateway features don't incur charges during Beta.
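To make the difference between a hard spend cap and after-the-fact alerting concrete, here is a conceptual sketch in plain Python. It is not the Unity AI Gateway implementation or its API. `Budget`, `route`, and the two backends are hypothetical names, and costs are tracked in whole cents as a simplifying assumption.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a request would exceed the hard spend cap."""


@dataclass
class Budget:
    hard_cap_cents: int
    spent_cents: int = 0


def route(prompt, backends, budget, est_cost_cents):
    """Send the prompt to the first healthy backend, within budget."""
    # Hard cap: refuse the request up front instead of alerting afterwards.
    if budget.spent_cents + est_cost_cents > budget.hard_cap_cents:
        raise BudgetExceeded("hard spend cap reached")
    for name, call in backends:
        try:
            reply = call(prompt)
        except RuntimeError:
            continue  # Fallback: try the next backend in the list.
        budget.spent_cents += est_cost_cents
        return name, reply
    raise RuntimeError("all backends failed")


# Hypothetical backends standing in for model services.
def primary(prompt):
    raise RuntimeError("primary unavailable")


def secondary(prompt):
    return f"echo: {prompt}"


budget = Budget(hard_cap_cents=5)
backends = [("primary", primary), ("secondary", secondary)]
served_by, reply = route("hello", backends, budget, est_cost_cents=3)
print(served_by, reply)
```

In this sketch the first request falls back to the second backend and is charged against the budget. A second request of the same estimated cost would raise `BudgetExceeded` before any backend is called, which is the behavior a hard cap is meant to guarantee.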
Set guardrails and access policies
Unity Catalog privilege grants determine whether a principal can call an AI service. Service policies govern how that interaction proceeds, based on the content of the request and response and on who is making the call. This matters most when agents act on behalf of users and reach external systems.
A service policy is a type of attribute-based access control (ABAC) policy scoped to AI services. You can allow, deny, or require human approval for an interaction, or transform request and response content — for example, redacting personally identifiable information (PII) with a built-in policy such as system.ai.mask_pii. See Service policies for AI securables and Create and attach a service policy.
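The four outcomes described above (allow, deny, require approval, transform) can be sketched as a small decision function. This is an illustrative model only. It does not show how service policies are defined or how `system.ai.mask_pii` works. The group name, the tool names, and the email-only masking rule are assumptions made for the example.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRE_APPROVAL = "require_approval"
    TRANSFORM = "transform"


@dataclass
class Decision:
    action: Action
    content: str


# Simplified email pattern; real PII detection covers far more than this.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def evaluate(principal_groups: set, tool: str, content: str) -> Decision:
    """Decide how a single request proceeds, based on caller and content."""
    # Who is calling: deny principals outside the (hypothetical) allowed group.
    if "support-agents" not in principal_groups:
        return Decision(Action.DENY, "")
    # What is being called: a (hypothetical) sensitive tool needs human approval.
    if tool == "issue_refund":
        return Decision(Action.REQUIRE_APPROVAL, content)
    # What is sent: mask email addresses before the request continues.
    masked = EMAIL.sub("[REDACTED_EMAIL]", content)
    if masked != content:
        return Decision(Action.TRANSFORM, masked)
    return Decision(Action.ALLOW, content)


decision = evaluate({"support-agents"}, "search_docs", "Contact jane.doe@example.com today")
print(decision.action.value, decision.content)
```

The ordering is a design choice in this sketch: identity checks run first, so denied callers never reach content inspection.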
Monitor usage, cost, and risk
Track activity, spend, and outcomes across all Unity AI Gateway services:
- Track requests, token usage, and latency for model services using system tables.
- Attribute Databricks cost to services, target models, principals, and tags.
- Log requests and responses to Unity Catalog Delta tables for monitoring and debugging.
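As a rough illustration of usage attribution, the sketch below totals requests and tokens per principal or per service from a list of request records. The record fields (`principal`, `service`, `tokens`) are hypothetical. This page does not give the system table names or schemas, so treat this as a model of the aggregation, not a query against real tables.

```python
from collections import defaultdict

# Hypothetical request records; real column names may differ.
records = [
    {"principal": "alice", "service": "chat-model", "tokens": 1200},
    {"principal": "bob", "service": "chat-model", "tokens": 300},
    {"principal": "alice", "service": "jira-mcp", "tokens": 0},
]


def usage_by(rows, key):
    """Total request count and token usage, grouped by the given field."""
    totals = defaultdict(lambda: {"requests": 0, "tokens": 0})
    for row in rows:
        bucket = totals[row[key]]
        bucket["requests"] += 1
        bucket["tokens"] += row["tokens"]
    return dict(totals)


by_principal = usage_by(records, "principal")
by_service = usage_by(records, "service")
print(by_principal)
print(by_service)
```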
How it works together
Unity AI Gateway builds on Unity Catalog governance across three layers:
- AI assets: Unity Catalog manages models, functions, connections, and services as securable objects, governed with standard Unity Catalog privileges. Services include model services, agent services, and MCP services.
- AI traffic: Unity AI Gateway is the central control plane for traffic to all AI services, including foundation models, tools, and agents.
- AI service behavior: Service policies govern the content of requests and responses to AI services, based on who is calling and what is sent.
For a conceptual overview of AI governance in Unity Catalog, see AI governance in Unity Catalog.
Model serving endpoints (previous)
The previous version of AI Gateway provides governance features for model serving endpoints at the workspace level, including external model endpoints, Foundation Model API endpoints, and custom model endpoints.
- Learn about AI Gateway features for serving endpoints, including supported features and limitations.
- Configure AI Gateway features such as usage tracking, payload logging, rate limits, and guardrails on a model serving endpoint.
- Monitor served models using AI Gateway-enabled inference tables (legacy).
- Apply LLM-based guardrails to inspect requests and responses and block or sanitize content that violates your policies.