Introduction to building gen AI apps on Databricks

Mosaic AI provides a comprehensive platform to build, deploy, and manage gen AI applications. This article walks through the essential components and processes for developing gen AI applications on Databricks.

Deploy and query gen AI models

For simple use cases, you can directly serve and query gen AI models, including high-quality open source models and third-party models from LLM providers such as OpenAI and Anthropic.

Mosaic AI Model Serving supports serving and querying generative AI models using the following capabilities:

  • Foundation Model APIs. This functionality makes state-of-the-art open models and fine-tuned model variants available to your model serving endpoint. These models are curated foundation model architectures that support optimized inference. Base models like DBRX Instruct, Llama-2-70B-chat, BGE-Large, and Mistral-7B are available for immediate use with pay-per-token pricing. Workloads that require performance guarantees, such as fine-tuned model variants, can be deployed with provisioned throughput.

  • External models. These are generative AI models that are hosted outside of Databricks. Endpoints that serve external models can be centrally governed, and you can establish rate limits and access control for them. Examples include foundation models like OpenAI’s GPT-4, Anthropic’s Claude, and others.
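As a sketch of how querying works, the following assumes a pay-per-token Foundation Model APIs endpoint reachable through the OpenAI-compatible interface at `https://<workspace-host>/serving-endpoints`. The endpoint name (`databricks-dbrx-instruct`) and the `DATABRICKS_HOST`/`DATABRICKS_TOKEN` environment variables are assumptions for illustration.

```python
import os

def build_chat_request(prompt: str, system: str = "You are a helpful assistant.") -> dict:
    """Build the chat-completions payload sent to a serving endpoint."""
    return {
        # Pay-per-token endpoint name; an assumption for this sketch.
        "model": "databricks-dbrx-instruct",
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": 256,
    }

if os.environ.get("DATABRICKS_TOKEN") and os.environ.get("DATABRICKS_HOST"):
    # Foundation Model APIs expose an OpenAI-compatible interface, so the
    # standard OpenAI client can be pointed at the workspace.
    from openai import OpenAI  # pip install openai

    client = OpenAI(
        api_key=os.environ["DATABRICKS_TOKEN"],
        base_url=f"https://{os.environ['DATABRICKS_HOST']}/serving-endpoints",
    )
    response = client.chat.completions.create(**build_chat_request("What is Unity Catalog?"))
    print(response.choices[0].message.content)
```

The same request shape works against any chat-task endpoint on the workspace, whether it serves a base model, a fine-tuned variant, or an external model.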

See Create generative AI model serving endpoints.
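To illustrate central governance of external models, the sketch below defines an endpoint configuration that proxies OpenAI's GPT-4 and registers it with the MLflow Deployments client. It assumes `mlflow[databricks]` is installed and workspace authentication is configured; the endpoint name, served-entity name, and secret scope/key are hypothetical.

```python
import os

# Endpoint configuration for an external model. The schema (served_entities,
# external_model, provider-specific config) follows the serving endpoint API;
# names and the secret reference are assumptions for this sketch.
external_endpoint_config = {
    "served_entities": [
        {
            "name": "gpt-4-chat",
            "external_model": {
                "name": "gpt-4",
                "provider": "openai",
                "task": "llm/v1/chat",
                "openai_config": {
                    # Reference a Databricks secret rather than pasting a raw key.
                    "openai_api_key": "{{secrets/my_scope/openai_api_key}}",
                },
            },
        }
    ]
}

if os.environ.get("DATABRICKS_HOST"):
    from mlflow.deployments import get_deploy_client  # pip install mlflow[databricks]

    client = get_deploy_client("databricks")
    client.create_endpoint(name="openai-chat", config=external_endpoint_config)
```

Because the provider credential lives in a Databricks secret and the endpoint is a workspace object, rate limits and access control can then be applied to it like any other serving endpoint.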

Mosaic AI Agent Framework

Mosaic AI Agent Framework comprises a set of tools on Databricks designed to help developers build, deploy, and evaluate production-quality agents like Retrieval Augmented Generation (RAG) applications.

It is compatible with third-party frameworks like LangChain and LlamaIndex, allowing you to develop with your preferred framework while leveraging Databricks’ managed Unity Catalog, Agent Evaluation Framework, and other platform benefits.

Quickly iterate on agent development using the following features:

  • Create and log agents using any library and MLflow. Parameterize your agents to experiment and iterate on agent development quickly.

  • Agent tracing lets you log, analyze, and compare traces across your agent code to debug and understand how your agent responds to requests.

  • Deploy agents to production with native support for token streaming and request/response logging, plus a built-in review app to get user feedback for your agent.
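To make the tracing feature above concrete, the sketch below decorates two agent steps so each call is recorded as a span. It assumes MLflow 2.13+ (where the `mlflow.trace` decorator is available) and falls back to a no-op decorator so the example still runs without it; the retrieval logic and documents are hypothetical stand-ins for a real RAG agent.

```python
try:
    import mlflow
    trace = mlflow.trace  # records a span for each decorated call
except ImportError:
    # Fallback so the sketch runs without mlflow installed.
    def trace(fn):
        return fn

@trace
def retrieve(query: str) -> list[str]:
    # Hypothetical retrieval step; a real agent would query a vector index.
    docs = {"pricing": "Pay-per-token pricing applies to base models."}
    return [text for key, text in docs.items() if key in query.lower()]

@trace
def answer(query: str) -> str:
    # Hypothetical generation step; a real agent would call an LLM endpoint
    # with the retrieved context.
    context = retrieve(query)
    return context[0] if context else "No relevant documents found."

print(answer("How does pricing work?"))
# → Pay-per-token pricing applies to base models.
```

With tracing enabled, the nested `retrieve` and `answer` spans appear in the MLflow UI, which is what lets you compare traces across requests when debugging agent behavior.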