Key challenges in building GenAI apps
Despite the power of modern GenAI models, production-grade generative AI applications are often challenging to build. Three key challenges can be summarized as:
- Governance: Many platforms struggle to provide unified governance, data privacy, and security for data and AI assets.
- Quality: The flexible and unpredictable behavior of GenAI models adds complexity to evaluation.
- Control: Many platforms limit flexibility, model choice, and customization, reducing organizations' control over their data and models.
Governance for data and AI
GenAI applications require diverse data and AI assets: tables, vector indexes, AI models, tools, and more. A GenAI platform must give developers fine-grained access to these assets, while giving administrators joint governance over them. Without complete governance, organizations face risks such as:
- Data leakage: Sensitive customer or enterprise data can be misused without proper lineage tracking and access control, and data can inadvertently leak through model outputs if proper guardrails are not enforced.
- Compliance restrictions: Many organizations have compliance requirements such as SOC2 or HIPAA, and integrating GenAI models into compliant legacy platforms can be complex, leading to delays or restrictions in using the best models.
- Unauthorized usage or unexpected costs: Without access controls and usage guardrails, AI models might be used by unauthorized teams or incur high usage costs.
Databricks simplifies unified governance for data and AI through:
- Unity Catalog, which manages files, tables, vector indexes, feature stores, models, and tools under a unified governance model
- AI Gateway, which provides unified governance and monitoring for AI model endpoints, including safety guardrails and usage limits
- Databricks AI Security Framework, which provides a comprehensive guide to AI risk management
- Databricks AI Governance Framework, which complements the Security Framework by providing a view of governance spanning both security and operational integrity
Quality of models, agents, and apps
GenAI models produce open-ended, stochastic outputs and are often applied to open-ended problems with many "good" answers. Even defining "high quality" can be challenging and often requires iterative feedback from domain experts or users. Without robust evaluation processes, organizations face risks such as:
- Bad user experiences: If GenAI apps are not evaluated based on metrics aligned with user needs, then users can find responses unhelpful, inaccurate, or even harmful or offensive. Brand reputations can suffer in extreme cases.
- Development limbo: If quality cannot be defined or measured in ways allowing stakeholder sign-off, GenAI projects can be delayed or canceled for lack of "proof" of quality.
Databricks simplifies measuring and optimizing AI quality through:
- MLflow Evaluation and Monitoring, with built-in judges and custom scorers to measure quality, usable in both development and production monitoring
- MLflow Tracing, with both automatic and manual tracing to provide observability for both development and production
- Human feedback collection, with a built-in app for expert feedback during development and APIs for user feedback from production apps
- Methods for optimizing trade-offs between quality, cost, and latency. Agent Framework and a flexible choice of AI models provide trade-off options for fully custom agents.
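Custom scorers are ordinary functions that grade a model output and return a score with a rationale. The sketch below shows the shape of such a scorer using a simple keyword-grounding heuristic; the function name and return format are illustrative assumptions, and the registration step that wires a scorer into an evaluation harness such as MLflow's is omitted here.

```python
def keyword_grounding_scorer(response: str, expected_keywords: list[str]) -> dict:
    """Hypothetical custom scorer sketch: scores a response by the fraction
    of expected keywords it contains. Real harnesses (e.g., MLflow
    Evaluation) run functions like this alongside built-in LLM judges."""
    text = response.lower()
    hits = [kw for kw in expected_keywords if kw.lower() in text]
    score = len(hits) / len(expected_keywords) if expected_keywords else 0.0
    return {
        "score": score,  # 0.0-1.0, higher is better
        "rationale": f"matched {hits} out of {expected_keywords}",
    }
```

Deterministic heuristics like this are cheap enough to run on every production response, while more expensive LLM judges can be sampled; that choice is itself a quality/cost/latency trade-off.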
Control of data and models
State-of-the-art GenAI models are available from many model providers, as well as through self-hosted open-source options. Because of data privacy and licensing complications, many platforms struggle to support this diverse ecosystem or to allow fast iteration and customization. Organizations must maintain control over their data and choice of models to avoid risks such as:
- Data privacy restrictions: Compliance or integration requirements can prevent organizations from accessing the top GenAI models from multiple providers, sacrificing flexibility and quality-cost trade-offs.
- Lack of competitive edge: If models, data, agents, and applications are not customizable based on an organization's proprietary data, then it is challenging to build intellectual property.
Databricks provides control and flexibility for data and models through:
- Foundation Model APIs, which serve frontier models from top model providers in your own Databricks environment, alongside your custom models and agents in Model Serving
- Customized apps, agents, models, tools, and data sources built around your proprietary data. Every one of these levels supports Data Intelligence: building apps and agents, providing data through tools, and evaluating and optimizing agents based on your data
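Served models are typically invoked over HTTP through a serving endpoint. The sketch below builds such a request using only the standard library; the URL shape and endpoint name follow the common `/serving-endpoints/<name>/invocations` pattern, but treat both as assumptions and consult your own workspace for the real values.

```python
import json
from urllib import request

def build_chat_request(base_url: str, endpoint: str, prompt: str, token: str) -> request.Request:
    """Sketch of a chat-completion request to a served model endpoint.
    URL shape and payload fields are illustrative assumptions."""
    url = f"{base_url}/serving-endpoints/{endpoint}/invocations"
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",  # workspace access token
            "Content-Type": "application/json",
        },
    )

# To actually send it (requires a live workspace and valid token):
# response = request.urlopen(build_chat_request(base, name, "Hello", token))
```

Because the same request shape works for provider-hosted frontier models and for your own fine-tuned models behind the same endpoint interface, swapping models becomes a configuration change rather than a code change.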