Machine learning on Databricks

Build, deploy, and manage machine learning applications on Databricks. The integrated platform unifies the entire ML lifecycle from data preparation to production monitoring.

Looking for generative AI and AI agents? See Build AI agents on Databricks.

Get started

Try a quickstart, vibe code a model, and use notebooks.

- Get started: Build your first machine learning model on Databricks
  - Build and deploy a simple classification model with scikit-learn.
- Use Genie Code for data science
  - Use an AI agent to explore data, build models, and iterate.
- Databricks notebooks
  - Collaborative development environment with support for Python, R, Scala, and SQL.
- Concepts: Data science and machine learning on Databricks
  - Learn the core concepts behind data science and machine learning on Databricks.

Train classic machine learning models

Engineer features, create machine learning models, and track experiments.

- Feature Store
  - Engineer features, manage them in Unity Catalog, and serve them in production.
- Model training examples
  - Explore end-to-end examples for training classic ML models with popular libraries.
- Databricks Runtime for ML
  - Pre-configured clusters with scikit-learn, XGBoost, MLflow, and other ML libraries, plus support for deep learning frameworks.
- MLflow tracking
  - Track experiments, compare model performance, and manage the complete model development lifecycle.

Train deep learning models

Use managed compute and built-in frameworks to develop deep learning models.

- AI Runtime
  - Use serverless GPU compute for custom deep learning training and inference workloads.
- Distributed training examples
  - Explore examples of distributed deep learning using Ray, TorchDistributor, and DeepSpeed.
- DL best practices
  - Learn about framework choice, data loading, distributed scaling, and managing the deep learning model lifecycle.
- Ray on Databricks
  - Scale ML workloads with distributed computing for large-scale model training and inference.

Deploy and serve models

Deploy models to production with scalable endpoints for real-time, streaming, or batch inference.

- Model Serving
  - Deploy custom models and LLMs as REST endpoints with automatic scaling and GPU support.
- AI Gateway
  - Govern and monitor access to models served on Databricks with usage tracking, payload logging, and security controls.
- Batch inference
  - Deploy models for batch and streaming inference on large datasets.
- Foundation model APIs
  - Access and query state-of-the-art GenAI models hosted by Databricks.
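
As a rough illustration of what serving a model as a REST endpoint can look like from the client side, the sketch below builds (but does not send) an HTTP scoring request. The workspace URL, endpoint name, token, and feature columns are hypothetical placeholders. The `/serving-endpoints/<name>/invocations` path and the `dataframe_split` payload shape are assumptions about the scoring format, so verify them against the Model Serving documentation for your workspace before relying on them.

```python
import json
import urllib.request

# Hypothetical placeholders: substitute your own workspace, endpoint, and token.
WORKSPACE_URL = "https://example.cloud.databricks.com"
ENDPOINT_NAME = "my-classifier"
TOKEN = "<personal-access-token>"


def build_scoring_request(columns, rows):
    """Build a POST request that carries tabular input for a served model.

    Assumes the endpoint accepts a JSON body of the form
    {"dataframe_split": {"columns": [...], "data": [[...], ...]}}.
    """
    payload = {"dataframe_split": {"columns": columns, "data": rows}}
    url = f"{WORKSPACE_URL}/serving-endpoints/{ENDPOINT_NAME}/invocations"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Two example rows with made-up feature names.
request = build_scoring_request(
    ["sepal_length", "sepal_width"],
    [[5.1, 3.5], [6.2, 2.9]],
)

# The request is only constructed here. To send it against a real endpoint:
# with urllib.request.urlopen(request) as response:
#     predictions = json.loads(response.read())
```

Building the request separately from sending it keeps the sketch runnable without network access or credentials, and makes the URL and payload easy to inspect.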

Monitor and govern ML systems

Ensure model quality, data integrity, and compliance with comprehensive monitoring and governance tools.

- Unity Catalog
  - Govern data, features, models, and functions with unified access control, lineage tracking, and discovery.
- MLflow for Models
  - Manage the full ML lifecycle, from experiments and models to evaluation and deployment.
- Anomaly detection
  - Monitor data freshness and completeness at the catalog level.
- Data profiling
  - Monitor data quality, model performance, and prediction drift with automated alerts and root cause analysis.

Productionize ML workflows

Scale machine learning operations with automated workflows, CI/CD integration, and production-ready pipelines.

- Models in Unity Catalog
  - Use the model registry in Unity Catalog for centralized governance and to manage the model lifecycle, including deployments.
- Lakeflow Jobs
  - Build automated workflows for ML pipelines.
- Declarative Automation Bundles
  - Manage Databricks infrastructure as code for CI/CD, including ML training and deployment.
- MLOps workflows
  - Learn about end-to-end MLOps with automated training, testing, and deployment pipelines.
