Large language models (LLMs)

Beta

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Databricks previews.

This page provides notebook examples for fine-tuning large language models (LLMs) using Serverless GPU compute. These examples demonstrate various approaches to fine-tuning including parameter-efficient methods like Low-Rank Adaptation (LoRA) and full supervised fine-tuning.
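To make the parameter efficiency concrete: LoRA freezes a layer's d_out × d_in weight matrix and learns only a rank-r update B·A, so the trainable count for that layer drops from d_out·d_in to r·(d_in + d_out). A quick illustration (the hidden size below is chosen for illustration, not tied to any model on this page):

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for a rank-r LoRA update B @ A,
    where B is (d_out, r) and A is (r, d_in)."""
    return r * (d_in + d_out)

d = 4096          # illustrative hidden size for one projection matrix
full = d * d      # parameters updated by full fine-tuning of this layer
lora = lora_trainable_params(d, d, r=16)

print(full, lora, f"{100 * lora / full:.2f}%")  # 16777216 131072 0.78%
```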

Fine-tune Qwen2-0.5B model

The following notebook provides an example of how to efficiently fine-tune the Qwen2-0.5B model using:

  • Transformer Reinforcement Learning (TRL) for supervised fine-tuning
  • Liger Kernel for memory-efficient training with optimized Triton kernels
  • LoRA for parameter-efficient fine-tuning

Notebook

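As a hedged sketch of how these pieces fit together (the dataset and hyperparameters here are illustrative, not taken from the notebook), a TRL supervised fine-tuning run with LoRA and Liger kernels has roughly this shape:

```python
def train():
    # Imports are kept inside the function: this sketch is meant to run in a
    # GPU notebook, but should stay readable without the libraries installed.
    from datasets import load_dataset
    from liger_kernel.transformers import apply_liger_kernel_to_qwen2
    from peft import LoraConfig
    from trl import SFTConfig, SFTTrainer

    # Patch Qwen2 modules with Liger's fused Triton kernels before loading.
    apply_liger_kernel_to_qwen2()

    # Rank-16 LoRA on the attention projections only.
    peft_config = LoraConfig(
        r=16,
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )

    dataset = load_dataset("trl-lib/Capybara", split="train")  # illustrative

    trainer = SFTTrainer(
        model="Qwen/Qwen2-0.5B",
        train_dataset=dataset,
        args=SFTConfig(output_dir="/tmp/qwen2-sft", bf16=True),
        peft_config=peft_config,
    )
    trainer.train()
```

The notebook itself may wire these up differently; treat this as a map of the moving parts, not the notebook's code.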

Fine-tune Llama-3.2-3B with Unsloth

This notebook demonstrates how to fine-tune Llama-3.2-3B using the Unsloth library.

Unsloth Llama

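In outline, an Unsloth fine-tuning run loads the model through Unsloth's patched loader, attaches LoRA adapters, and hands the result to a TRL trainer. A minimal sketch, assuming an Unsloth-hosted checkpoint and an illustrative dataset (neither is taken from the notebook):

```python
def train():
    # Imports inside the function: this is a sketch for a GPU notebook,
    # readable without unsloth/trl installed locally.
    from datasets import load_dataset
    from trl import SFTConfig, SFTTrainer
    from unsloth import FastLanguageModel

    # Unsloth loads the model with its fused kernels; 4-bit to save memory.
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/Llama-3.2-3B-Instruct",  # illustrative checkpoint
        max_seq_length=2048,
        load_in_4bit=True,
    )

    # Attach LoRA adapters through Unsloth's PEFT wrapper.
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"],
    )

    trainer = SFTTrainer(
        model=model,
        processing_class=tokenizer,  # tokenizer= on older TRL versions
        train_dataset=load_dataset("trl-lib/Capybara", split="train"),
        args=SFTConfig(output_dir="/tmp/llama32-sft", max_steps=60),
    )
    trainer.train()
```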

Video demo

This video walks through the notebook in detail (12 minutes).

Fine-tune a GPT OSS 20B model

This notebook demonstrates how to fine-tune OpenAI's gpt-oss-20b model on an H100 GPU using LoRA for parameter-efficient fine-tuning.

Fine-tune GPT

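A back-of-envelope memory estimate shows why LoRA matters at this scale. The figures below treat the model as 20B dense parameters held in bf16; the released checkpoint's actual layout and quantization differ, so this is a rough bound, not a measurement:

```python
PARAMS = 20e9
GIB = 2**30

# Frozen bf16 weights: 2 bytes per parameter.
weights_gib = PARAMS * 2 / GIB   # ~37 GiB

# Full fine-tuning with Adam in mixed precision needs roughly 16 bytes/param:
# bf16 weights (2) + bf16 grads (2) + fp32 master copy (4) + fp32 Adam moments (8).
full_ft_gib = PARAMS * 16 / GIB  # ~298 GiB: far beyond one 80 GB H100

# With LoRA, gradients and optimizer state exist only for the small adapters,
# so the frozen weights plus activations can fit on a single H100.
print(round(weights_gib), round(full_ft_gib))
```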

Supervised fine-tuning using DeepSpeed and TRL

This notebook demonstrates how to use the Serverless GPU Python API to run supervised fine-tuning (SFT) using the Transformer Reinforcement Learning (TRL) library with DeepSpeed ZeRO Stage 3 optimization.

TRL DeepSpeed

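For reference, a minimal ZeRO Stage 3 configuration of the kind passed to the Hugging Face Trainer/TRL integration might look like this. The values here are generic defaults, not the notebook's settings ("auto" lets the Trainer fill in its own values):

```json
{
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "bf16": { "enabled": true },
  "gradient_accumulation_steps": "auto",
  "train_micro_batch_size_per_gpu": "auto"
}
```

Stage 3 partitions optimizer states, gradients, and the model parameters themselves across workers, which is what makes multi-GPU SFT of larger models feasible.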

LoRA fine-tuning using Axolotl

This notebook demonstrates how to use the Serverless GPU Python API to fine-tune an Olmo3 7B model with LoRA using the Axolotl library.

Axolotl

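Axolotl is driven by a single YAML config rather than training code. A minimal LoRA config has this shape; the model id and dataset below are placeholders, not the notebook's values:

```yaml
base_model: allenai/OLMo-3-7B   # placeholder id; use the notebook's checkpoint
adapter: lora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - v_proj

datasets:
  - path: tatsu-lab/alpaca      # placeholder dataset
    type: alpaca

sequence_len: 2048
micro_batch_size: 2
gradient_accumulation_steps: 4
num_epochs: 1
learning_rate: 2.0e-4
optimizer: adamw_torch
lr_scheduler: cosine
output_dir: ./outputs/olmo3-lora
```

The config is then launched with Axolotl's CLI (for example, `axolotl train config.yml`).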