Machine learning with Mosaic AI
Mosaic AI unifies the AI lifecycle from data collection and preparation, to model development and LLMOps, to serving and monitoring. With Mosaic AI, a single platform serves every step of ML development and deployment. Data scientists, data engineers, ML engineers and DevOps can do their jobs using the same set of tools and a single source of truth for the data.
This article describes the tools SAP Databricks provides to help you build artificial intelligence (AI) and machine learning (ML) systems.
Features
The following AI and ML features are included in SAP Databricks:
- AI Playground for testing generative AI models from your Databricks workspace. You can prompt, compare and adjust settings such as system prompt and inference parameters.
- AI Functions that you can use to apply AI, like text translation or sentiment analysis, on your data that is stored on Databricks.
- Mosaic AI Gateway for governing and monitoring access to supported generative AI models and their associated model serving endpoints.
- Mosaic AI Model Serving for deploying LLMs.
- Mosaic AI Vector Search provides a queryable vector database that stores embedding vectors and can be configured to automatically sync to your knowledge base.
- Lakehouse Monitoring for data monitoring and tracking model prediction quality and drift using automatic payload logging with inference tables.
- Managed MLflow for AI agent and ML model lifecycle.
- Mosaic AI Agent Framework for building and deploying production-quality agents like Retrieval Augmented Generation (RAG) applications.
- Mosaic AI Agent Evaluation for evaluating the quality, cost, and latency of generative AI applications, including RAG applications and chains.
- AutoML to simplify the process of applying machine learning to your datasets.
- Foundation Model Fine-tuning for customizing a foundation model using your own data to optimize its performance for your specific application.
- Unity Catalog for managing AI assets, including models and experiments.
Manage AI assets with Unity Catalog
Mosaic AI unifies the data layer and ML platform. All data assets and artifacts, such as models and functions, are discoverable and governed in a single catalog. Using a single platform for data and models makes it possible to track lineage from the raw data to the production model. Built-in data and model monitoring saves quality metrics to tables that are also stored in the platform, making it easier to identify the root cause of model performance problems.
Chat with LLMs and prototype generative AI apps using AI Playground
You can interact with supported large language models using the AI Playground. The AI Playground is a chat-like environment where you can test, prompt, and compare LLMs. See Chat with LLMs and prototype generative AI apps using AI Playground.
Use AI Functions in SQL
AI Functions are built-in functions that you can use to apply AI, like text translation or sentiment analysis, on your data that is stored on Databricks. They can be run from the notebook and SQL editor. Analysts, data scientists, and machine learning engineers can use AI functions to apply data intelligence to their proprietary data. See Apply AI on data using SAP Databricks AI Functions.
AI Gateway
Mosaic AI Gateway is designed to streamline the usage and management of generative AI models and agents within an organization. It is a centralized service that brings governance, monitoring, and production readiness to model serving endpoints. It also allows you to run, secure, and govern AI traffic to democratize and accelerate AI adoption for your organization. See Configure AI Gateway on model serving endpoints.
Deploy models using Mosaic AI Model Serving
Mosaic AI Model Serving provides a unified interface to deploy, govern, and query AI models for real-time and batch inference. Each model you serve is available as a REST API that you can integrate into your web or client application.
You can configure a model serving endpoint specifically for accessing generative AI models:
- State-of-the-art open LLMs using Foundation Model APIs.
- Third-party models hosted outside of Databricks.
Mosaic AI Vector Search
Mosaic AI Vector Search is a vector search solution that is built into the Databricks Data Intelligence Platform and integrated with its governance and productivity tools. Mosaic AI Vector Search provides a queryable vector database that stores embedding vectors and can be configured to automatically sync to your knowledge base. Embeddings are crucial for applications that require similarity searches, such as RAG (Retrieval Augmented Generation), recommendation systems, and image recognition.
Lakehouse monitoring
Databricks Lakehouse Monitoring lets you monitor the statistical properties and quality of the data in all of the tables in your account. You can also use it to track the performance of machine learning models and model-serving endpoints by monitoring inference tables that contain model inputs and predictions. See Monitor data and AI assets with Lakehouse Monitoring.
Managed MLflow
SAP Databricks provides a managed version of MLflow 2.0. MLflow is an open source platform for developing models and generative AI applications. It has the following primary components:
- Tracking: Allows you to track experiments to record and compare parameters and results.
- Models: Allow you to manage and deploy models from various ML libraries to various model serving and inference platforms.
- Model Registry: Allows you to manage the model deployment process from staging to production, with model versioning and annotation capabilities.
- AI agent evaluation and tracing: Allows you to develop high-quality AI agents by helping you compare, evaluate, and troubleshoot agents.
Mosaic AI Agent Framework
Mosaic AI Agent Framework comprises a set of tools on Databricks designed to help developers build, deploy, and evaluate production-quality agents like Retrieval Augmented Generation (RAG) applications.
It is compatible with third-party frameworks like LangChain and LlamaIndex, allowing you to develop with your preferred framework and while leveraging Databricks' managed Unity Catalog, Agent Evaluation Framework, and other platform benefits.
Mosaic AI Agent Evaluation
Mosaic AI Agent Evaluation helps developers evaluate the quality, cost, and latency of agentic AI applications, including RAG applications and chains. Agent Evaluation is designed to both identify quality issues and determine the root cause of those issues. The capabilities of Agent Evaluation are unified across the development, staging, and production phases of the MLOps life cycle, and all evaluation metrics and data are logged to MLflow Runs.
AutoML forecasting
AutoML forecasting simplifies forecasting time-series data by automatically selecting the best algorithm and hyperparameters, all while running on fully-managed compute resources. To run a forecasting experiment, see AutoML forecasting.
Foundation Model Fine-tuning
To use this product, your workspace must be in the supported region, us-east-1
.
With Foundation Model Fine-tuning, you can use your own data to customize a foundation model to optimize its performance for your specific application. By conducting fine-tuning or continuing training of a foundation model, you can train your own model using significantly less data, time, and compute resources than training a model from scratch.
Learn more
To learn more about machine learning and artificial intelligence features on Databricks, see the complete AI and machine learning on Databricks in AWS documentation. NOTE: The set of supported features on SAP Databricks differs from Databricks in AWS.