Train AI and ML models

This section shows you how to train machine learning and AI models on Mosaic AI.

AutoML

Databricks AutoML simplifies the process of applying machine learning to your datasets by automatically finding the best algorithm and hyperparameter configuration for you. AutoML offers a low-code UI as well as a Python API.

Mosaic AI Model Training

Mosaic AI Model Training (formerly Foundation Model Training) on Databricks lets you customize large language models (LLMs) using your own data. This process involves fine-tuning the training of a pre-existing foundation model, significantly reducing the data, time, and compute resources required compared to training a model from scratch. Key features include:

  • Supervised fine-tuning: Adapt your model to new tasks by training on structured prompt-response data.

  • Continued pre-training: Enhance your model with additional text data to add new knowledge or focus on a specific domain.

  • Chat completion: Train your model on chat logs to improve conversational abilities.

Open source library examples

See machine learning training examples from a wide variety of open source machine learning libraries, including hyperparameter tuning examples using Optuna and Hyperopt.

Deep learning

See examples and best practices for distributed deep learning training so you can develop and fine-tune deep learning models on Databricks.

Recommenders

Learn how to train deep-learning-based recommendation models on Databricks. Compared to traditional recommendation models, deep learning models can achieve higher quality results and scale to larger amounts of data.