
Multi-node distributed training

Beta

This feature is in Beta.

This page provides notebook examples for multi-node distributed training using Serverless GPU compute. These examples demonstrate how to scale training across multiple GPUs and nodes for improved performance.

Serverless GPU API: A10 starter

The following notebook provides a basic example of how to use the Serverless GPU Python API to launch a distributed training workload across multiple A10 GPUs.

Notebook: Open notebook in new tab
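
The notebook contains the full, runnable example, including the launch call that is specific to the Serverless GPU Python API. As a rough sketch, the function body that such a launcher runs is ordinary PyTorch `DistributedDataParallel` code. The example below assumes the launcher provides the standard `torch.distributed` rendezvous environment variables (`RANK`, `WORLD_SIZE`, `MASTER_ADDR`, `MASTER_PORT`, `LOCAL_RANK`); the model and data are placeholders.

```python
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def train():
    # The launcher is expected to set the standard torch.distributed
    # rendezvous environment variables before this function runs.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    torch.cuda.set_device(local_rank)

    # Toy model and synthetic data; replace with your own model and DataLoader.
    model = DDP(torch.nn.Linear(128, 1).cuda(), device_ids=[local_rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.MSELoss()

    for step in range(100):
        x = torch.randn(64, 128, device="cuda")
        y = torch.randn(64, 1, device="cuda")
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()  # DDP all-reduces gradients across all A10 GPUs here
        optimizer.step()
        if step % 20 == 0 and dist.get_rank() == 0:
            print(f"step {step}: loss {loss.item():.4f}")

    dist.destroy_process_group()
```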

Distributed training and hyperparameter sweeps

The following notebook provides an example of distributed fine-tuning combined with hyperparameter sweeps using the Serverless GPU Python API.

Notebook: Open notebook in new tab
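
The sweep logic in the notebook is tied to the Serverless GPU API, but the overall pattern is a loop over hyperparameter configurations where each configuration gets its own distributed training run. The sketch below is a minimal grid sweep; `train_one_config` is a hypothetical stand-in for a distributed fine-tuning run and returns a dummy score so the loop is runnable as-is.

```python
import itertools


def train_one_config(learning_rate: float, batch_size: int) -> float:
    """Hypothetical helper: in the notebook this would launch a distributed
    fine-tuning run for one configuration and return its validation loss.
    A dummy score is returned here so the sweep loop runs end to end."""
    return abs(learning_rate - 5e-5) * 1e4 + 1.0 / batch_size


# Simple grid sweep: each configuration gets its own training run, and the
# configuration with the lowest validation loss is kept.
grid = {
    "learning_rate": [1e-5, 5e-5, 1e-4],
    "batch_size": [16, 32],
}

best_config, best_loss = None, float("inf")
for lr, bs in itertools.product(grid["learning_rate"], grid["batch_size"]):
    val_loss = train_one_config(learning_rate=lr, batch_size=bs)
    if val_loss < best_loss:
        best_config, best_loss = {"learning_rate": lr, "batch_size": bs}, val_loss

print(f"best config: {best_config} (validation loss {best_loss:.4f})")
```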

Distributed supervised fine-tuning using TRL

This notebook demonstrates how to use Databricks Serverless GPU compute to run supervised fine-tuning (SFT) with the TRL library and DeepSpeed ZeRO Stage 3 optimization on a single node with A10 GPUs. The same approach can be extended to multi-node setups.

Notebook: Open notebook in new tab
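
The notebook contains the complete workflow, including the distributed launch through the Serverless GPU API. As a hedged sketch of how the pieces fit together, the example below wires TRL's `SFTTrainer` to a minimal DeepSpeed ZeRO Stage 3 configuration. The model name, dataset, and output path are placeholders, and the actual multi-GPU launch (handled by the notebook) is omitted.

```python
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

# Tiny toy corpus with a "text" column (the default field SFTTrainer trains on);
# in practice this would be your instruction-tuning dataset.
train_dataset = Dataset.from_dict({
    "text": [
        "### Question: What is 2 + 2?\n### Answer: 4",
        "### Question: Name a primary color.\n### Answer: Blue",
    ]
})

# Minimal DeepSpeed ZeRO Stage 3 config; "auto" values are filled in from the
# training arguments by the Hugging Face Trainer's DeepSpeed integration.
ds_zero3 = {
    "zero_optimization": {"stage": 3},
    "bf16": {"enabled": "auto"},
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

args = SFTConfig(
    output_dir="/tmp/sft-demo",  # placeholder output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    bf16=True,
    logging_steps=1,
    deepspeed=ds_zero3,  # enables ZeRO Stage 3 parameter/optimizer sharding
)

trainer = SFTTrainer(
    model="facebook/opt-125m",  # small placeholder model; swap in your base model
    args=args,
    train_dataset=train_dataset,
)
trainer.train()
```

With ZeRO Stage 3, model parameters, gradients, and optimizer states are sharded across all participating GPUs, which is what makes fine-tuning larger models feasible on A10-class hardware and what allows the same script to scale from a single node to multiple nodes.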