Large language models (LLMs)
AI Runtime for single-node tasks is in Public Preview. The distributed training API for multi-GPU workloads remains in Beta.
This page provides notebook examples for fine-tuning LLMs using AI Runtime. These examples demonstrate various approaches to fine-tuning, including parameter-efficient methods such as Low-Rank Adaptation (LoRA) and full supervised fine-tuning (SFT).
| Tutorial | Description |
|---|---|
| Fine-tune Qwen2-0.5B with TRL and LoRA | Efficiently fine-tune the Qwen2-0.5B model using Transformer Reinforcement Learning (TRL), Liger Kernels for memory-efficient training, and LoRA for parameter-efficient fine-tuning. |
| Fine-tune Llama-3.2-3B with Unsloth | Fine-tune Llama-3.2-3B using the Unsloth library. |
| | Fine-tune OpenAI's |
| SFT with TRL and DeepSpeed ZeRO-3 | Use the Serverless GPU Python API to run supervised fine-tuning (SFT) using the TRL library with DeepSpeed ZeRO Stage 3 optimization. |
| LoRA fine-tune Olmo3 7B with Axolotl | Use the Serverless GPU Python API to LoRA fine-tune an Olmo3 7B model using the Axolotl library. |
| Distributed LoRA fine-tuning of Qwen2-0.5B | Fine-tune the Qwen2-0.5B model using LoRA and Liger Kernels for memory-efficient distributed training. |
| Distributed fine-tuning of Llama-3.2-3B with Unsloth | Fine-tune Llama-3.2-3B across multiple GPUs using the Unsloth library for optimized parameter-efficient training. |
| Fine-tune Llama 3.1 8B with Mosaic LLM Foundry | Fine-tune the Llama 3.1 8B model using Mosaic LLM Foundry with distributed training strategies and model evaluation. |
| Fine-tune GPT-OSS 120B | Fine-tune OpenAI's GPT-OSS 120B model using supervised fine-tuning on H100 GPUs with DDP and FSDP distributed training strategies. |
| Train Transformers with PyTorch FSDP | Train Transformer models using PyTorch Fully Sharded Data Parallel (FSDP) to shard model parameters across multiple GPUs. |
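Several of the tutorials above rely on LoRA, which freezes the pretrained weight matrix and trains only a low-rank update in its place. The sketch below (illustrative only, using NumPy rather than any of the notebook libraries; all names are hypothetical) shows the core idea: the adapted layer computes `x @ W + (alpha/r) * x @ A @ B`, where only the small factors `A` and `B` are trainable, so the trainable parameter count drops by orders of magnitude.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16):
    """LoRA-adapted linear layer: y = x W + (alpha / r) * x A B."""
    r = A.shape[1]
    return x @ W + (alpha / r) * (x @ A @ B)

d_in, d_out, r = 512, 512, 8
rng = np.random.default_rng(0)
W = rng.standard_normal((d_in, d_out))      # frozen pretrained weight
A = rng.standard_normal((d_in, r)) * 0.01   # trainable low-rank factor
B = np.zeros((r, d_out))                    # trainable, zero-initialized

full_params = W.size
lora_params = A.size + B.size
print(f"trainable params: {lora_params} vs full: {full_params} "
      f"({100 * lora_params / full_params:.1f}%)")
# prints: trainable params: 8192 vs full: 262144 (3.1%)

# Because B starts at zero, the adapted layer initially matches the
# frozen layer exactly, so fine-tuning starts from the pretrained model.
x = rng.standard_normal((2, d_in))
assert np.allclose(lora_forward(x, W, A, B), x @ W)
```

Libraries like PEFT, Unsloth, and Axolotl apply this same decomposition inside the model's attention and MLP projections; zero-initializing `B` is the standard trick that makes the adapter a no-op at the start of training.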
Video demo
This video walks through the Fine-tune Llama-3.2-3B with Unsloth example notebook in detail (12 minutes).