Large language models (LLMs)
Beta
This feature is in Beta.
This page provides notebook examples for fine-tuning large language models (LLMs) using Serverless GPU compute. These examples demonstrate various approaches to fine-tuning, including parameter-efficient methods such as Low-Rank Adaptation (LoRA) and full supervised fine-tuning.
Fine-tune the Qwen2-0.5B model
The following notebook provides an example of how to efficiently fine-tune the Qwen2-0.5B model using:
- Transformer Reinforcement Learning (TRL) for supervised fine-tuning
- Liger Kernel for memory-efficient training with optimized Triton kernels
- LoRA for parameter-efficient fine-tuning
Notebook
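The following is a minimal sketch, not the notebook itself, of how these pieces fit together with a recent TRL release (one where `SFTTrainer` accepts a model name string and `SFTConfig` exposes the `use_liger_kernel` flag). The dataset and hyperparameters are illustrative assumptions.

```python
# Sketch: TRL supervised fine-tuning of Qwen2-0.5B with LoRA and Liger kernels.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Example dataset; replace with your own instruction data.
dataset = load_dataset("trl-lib/Capybara", split="train")

# LoRA: train only low-rank adapter matrices on the attention projections.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="qwen2-0.5b-sft",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    use_liger_kernel=True,  # swap in Liger Kernel's optimized Triton kernels
)

trainer = SFTTrainer(
    model="Qwen/Qwen2-0.5B",
    train_dataset=dataset,
    args=training_args,
    peft_config=peft_config,
)
trainer.train()
```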
Fine-tune Llama-3.2-3B with Unsloth
This notebook demonstrates how to fine-tune Llama-3.2-3B using the Unsloth library.
Notebook
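As a rough sketch of the Unsloth workflow (assumed model ID and LoRA settings, not the notebook's exact configuration), the base model is loaded in 4-bit and wrapped with LoRA adapters before training:

```python
# Sketch: loading Llama-3.2-3B with Unsloth and attaching LoRA adapters.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct",  # assumed model id
    max_seq_length=2048,
    load_in_4bit=True,  # 4-bit quantized base weights to reduce GPU memory
)

# Only these low-rank adapter matrices are trained; base weights stay frozen.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
# The wrapped model can then be passed to TRL's SFTTrainer as usual.
```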
Fine-tune an embedding model
This notebook demonstrates how to fine-tune the gte-large-en-v1.5
embedding model on a single A10G GPU using Mosaic LLM Foundry.
Notebook
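The notebook itself uses Mosaic LLM Foundry; as a rough illustration of the same idea (contrastive fine-tuning of gte-large-en-v1.5 with in-batch negatives), here is a sketch using the sentence-transformers trainer instead. The dataset, batch size, and other settings are assumptions, not the notebook's configuration.

```python
# Sketch: contrastive fine-tuning of gte-large-en-v1.5 with sentence-transformers.
from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("Alibaba-NLP/gte-large-en-v1.5", trust_remote_code=True)

# (anchor, positive) pairs; other examples in the batch act as negatives.
train_dataset = load_dataset("sentence-transformers/all-nli", "pair", split="train[:10000]")

args = SentenceTransformerTrainingArguments(
    output_dir="gte-large-finetuned",
    per_device_train_batch_size=32,  # sized for a single A10G (24 GB)
    num_train_epochs=1,
    fp16=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=MultipleNegativesRankingLoss(model),
)
trainer.train()
```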
Fine-tune a GPT OSS 20B model
This notebook demonstrates how to fine-tune OpenAI's gpt-oss-20b
model on an H100 GPU using LoRA for parameter-efficient fine-tuning.
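A minimal sketch of this style of run is below, assuming the common PEFT + TRL recipe; the dataset, target modules, and hyperparameters are illustrative, and any gpt-oss-specific model-loading details (such as handling its quantized weights) are omitted.

```python
# Sketch: LoRA fine-tuning of openai/gpt-oss-20b with TRL on a single H100.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Example dataset; replace with your own chat or instruction data.
dataset = load_dataset("HuggingFaceH4/Multilingual-Thinking", split="train")

# LoRA keeps the 20B base weights frozen and trains small adapter matrices.
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="gpt-oss-20b-lora",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # keep memory use within a single H100
    bf16=True,
    num_train_epochs=1,
)

trainer = SFTTrainer(
    model="openai/gpt-oss-20b",
    train_dataset=dataset,
    args=training_args,
    peft_config=peft_config,
)
trainer.train()
```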