Introduction: End-to-end generative AI app guide

This generative AI app guide (formerly called the AI cookbook) and its sample code take you from a proof of concept (POC) to a high-quality, production-ready application using Mosaic AI Agent Evaluation and Mosaic AI Agent Framework on the Databricks platform. You can also use the GitHub repository as a template for creating your own AI applications.

See a list of the pages in the Generative AI app guide.

Tip

Choose a starting point in this guide based on your situation:

You only have a few minutes and want to see a demo of Mosaic AI Agent Framework & Agent Evaluation.
You want to get directly into code and deploy a gen AI app POC using your data.
You don’t have any data, but want to deploy a sample gen AI application.

What do we mean by high-quality AI?

The Databricks generative AI app guide is a how-to guide for building high-quality generative AI applications. High-quality applications are:

Accurate: They provide correct responses.
Safe: They do not deliver harmful or insecure responses.
Governed: They respect data permissions and access controls, and they track lineage.

This guide lays out the best-practice development workflow from Databricks for building high-quality gen AI apps: evaluation-driven development. It outlines the most relevant ways to increase the quality of a retrieval-augmented generation (RAG) application and provides a comprehensive repository of sample code that implements those techniques.
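The core idea of evaluation-driven development can be sketched in a few lines of Python: score every candidate version of an app against the same fixed evaluation set, and promote a change only if the score improves. This is a minimal illustration, not the Mosaic AI Agent Evaluation API. The names `EvalExample`, `score_app`, and `pick_best` are hypothetical, and the keyword-match metric is an assumed stand-in for the LLM judges and human feedback described in this guide.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class EvalExample:
    question: str
    expected_keyword: str  # hypothetical ground truth: a keyword a correct answer must contain


def score_app(app: Callable[[str], str], eval_set: List[EvalExample]) -> float:
    """Return the fraction of examples whose response contains the expected keyword."""
    if not eval_set:
        raise ValueError("eval_set must not be empty")
    hits = sum(
        1
        for ex in eval_set
        if ex.expected_keyword.lower() in app(ex.question).lower()
    )
    return hits / len(eval_set)


def pick_best(
    candidates: Dict[str, Callable[[str], str]], eval_set: List[EvalExample]
) -> Tuple[str, float]:
    """Score every candidate on the same evaluation set and return the best one."""
    scores = {name: score_app(app, eval_set) for name, app in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy evaluation set and two toy "app versions" (both are illustrative assumptions).
EVAL_SET = [
    EvalExample("What is the capital of France?", "Paris"),
    EvalExample("What is 2 + 2?", "4"),
]


def baseline_app(question: str) -> str:
    return "I am not sure."


def improved_app(question: str) -> str:
    answers = {
        "What is the capital of France?": "The capital of France is Paris.",
        "What is 2 + 2?": "2 + 2 equals 4.",
    }
    return answers.get(question, "I am not sure.")


if __name__ == "__main__":
    name, score = pick_best(
        {"baseline": baseline_app, "improved": improved_app}, EVAL_SET
    )
    print(f"best candidate: {name} (score={score:.2f})")
```

Holding the evaluation set fixed across iterations is what makes the scores comparable from one version to the next. In a real project the keyword check would be replaced by richer quality metrics.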

The Databricks approach to quality

Databricks takes the following approach to AI quality:

Provide a fast, code-first developer loop for iterating rapidly on quality.
Make it easy to collect human feedback.
Provide a framework for rapid and reliable measurement of app quality.

Animated walkthrough of the Mosaic AI review app in Databricks.

This guide is intended for use with the Databricks platform. Specifically:

Mosaic AI Agent Framework, which provides a fast developer workflow with enterprise-ready LLMOps and governance.
Mosaic AI Agent Evaluation, which provides reliable quality measurement using proprietary AI-assisted LLM judges, with quality metrics powered by human feedback collected through an intuitive web-based chat UI.

Code-based workflows

Choose the workflow below that best meets your needs:

Time required	What you’ll build	Link
10 minutes	Sample gen AI app deployed to a web-based chat app that collects feedback	Gen AI app demo
2 hours	POC gen AI app with your data deployed to a chat UI that can collect feedback from your business stakeholders	Build and deploy a POC
1 hour	Comprehensive quality, cost, and latency evaluation of your POC app	Evaluate your POC; Identify the root causes of quality issues
