Use agents on Databricks

This page provides an overview of tools for building, deploying, and using AI agents on Databricks. To learn more about agents, see Agent system design patterns.

- Get started: no-code GenAI
- Try AI Playground for UI-based testing and prototyping.
- Get started: MLflow 3 for GenAI
- Try MLflow for GenAI tracing, evaluation, and human feedback.

Serve and query gen AI large language models (LLMs)

Serve a curated set of gen AI models from LLM providers such as OpenAI and Anthropic and make them available through secure, scalable APIs.

Feature	Description
Foundation Models	Serve gen AI models, including open source and third-party models such as Meta Llama, Anthropic Claude, OpenAI GPT, and more.

Build and deploy enterprise-grade AI agents

Build and deploy your own agents, including tool-calling agents, retrieval-augmented generation apps, and multi-agent systems. For a no-code starting point, use the AI Playground to select an LLM, add tools, and chat with the agent to test its responses before exporting to code.

AI Playground provides a low-code option for agent prototyping.

Feature	Description
AI Playground (no code)	Prototype and test AI agents in a no-code environment. Quickly experiment with agent behaviors and tool integrations before generating code for deployment.
Knowledge Assistant	Build and optimize domain-specific AI chatbots using an intuitive interface.
Build custom agents	Author, deploy, and evaluate agents using Python. Supports agents written with any authoring library, including LangGraph, LangChain, OpenAI, and LlamaIndex. Integrated with MLflow Tracing. Iterate quickly using Databricks Apps. To get started quickly, see Get started with AI agents.
Connect agents to tools	Create agent tools to query structured and unstructured data, run code, or connect to external service APIs.
MCP (Model Context Protocol)	Standardize how agents connect to data and tools with a secure, consistent interface.

Feature	Description
AI Playground (no code)	Prototype and test AI agents in a no-code environment. Quickly experiment with agent behaviors and tool integrations before generating code for deployment.
Knowledge Assistant	Build and optimize domain-specific AI chatbots using an intuitive interface.
Build custom agents	Author, deploy, and evaluate agents using Python. Supports agents written with any authoring library, including LangGraph, LangChain, OpenAI, and LlamaIndex. Integrated with MLflow Tracing. Iterate quickly using Databricks Apps. To get started quickly, see Get started with AI agents.
Connect agents to tools	Create agent tools to query structured and unstructured data, run code, or connect to external service APIs.
MCP (Model Context Protocol)	Standardize how agents connect to data and tools with a secure, consistent interface.

Use external agents

If your agent runs outside Databricks, register it as an Agent Service in Unity Catalog to make it discoverable to your team and govern who can use it.

Feature	Description
Agent Services (Beta)	Register external agents in Unity Catalog. Browse and discover agents across teams from a single view, and control access with the same grants that protect your tables, models, and functions.

Feature	Description
Agent Services (Beta)	Register external agents in Unity Catalog. Browse and discover agents across teams from a single view, and control access with the same grants that protect your tables, models, and functions.

Evaluate, debug, and optimize agents

Track agent performance, collect feedback, and drive quality improvements with evaluation and tracing tools.

Feature	Description
MLflow Tracing	Use MLflow Tracing for end-to-end observability. Log every step your agent takes to debug, monitor, and audit agent behavior in development and production.
Agent Evaluation	Use Agent Evaluation and MLflow to measure quality, cost, and latency. Collect feedback from stakeholders and subject matter experts through built-in review apps and use LLM judges to identify and resolve quality issues.
Monitor agents	Use the same evaluation configuration (LLM judges and custom metrics) in offline evaluation and online monitoring.

Feature	Description
MLflow Tracing	Use MLflow Tracing for end-to-end observability. Log every step your agent takes to debug, monitor, and audit agent behavior in development and production.
Agent Evaluation	Use Agent Evaluation and MLflow to measure quality, cost, and latency. Collect feedback from stakeholders and subject matter experts through built-in review apps and use LLM judges to identify and resolve quality issues.
Monitor agents	Use the same evaluation configuration (LLM judges and custom metrics) in offline evaluation and online monitoring.

Serve and query gen AI large language models (LLMs)​

Build and deploy enterprise-grade AI agents​

Use external agents​

Evaluate, debug, and optimize agents​

Serve and query gen AI large language models (LLMs)

Build and deploy enterprise-grade AI agents

Use external agents

Evaluate, debug, and optimize agents