Get started: Serverless GPU compute with H100 GPUs

This notebook demonstrates how to use Databricks Serverless GPU compute with H100 accelerators. You'll learn how to connect to H100 GPUs and run distributed workloads using the serverless_gpu Python library.

The serverless_gpu library enables seamless execution of GPU workloads directly from Databricks notebooks. It provides decorators and runtime utilities for distributed GPU computing. To learn more, see the Serverless GPU API documentation.

Connect to serverless GPU compute

To run this notebook, you need access to Databricks Serverless GPU compute with H100 accelerators.

  1. From the compute selector, select Serverless GPU.
  2. In the "Environment" tab on the right side, select H100 for your accelerator. This option uses 8 H100 chips on a single node.
  3. Click Apply.

See the Hello World example below for how to target remote GPUs to scale to more resources.

When to use H100 GPUs

Compared to A10s, H100s deliver higher floating-point throughput (FLOPS) and more high-bandwidth memory (HBM). Use H100s for large-model training where high throughput, large GPU memory, or both are needed.
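A quick back-of-the-envelope memory estimate can help decide whether a workload needs H100-class capacity. The sketch below uses a common heuristic of roughly 16 bytes per parameter for mixed-precision Adam training (fp16 weights and gradients plus fp32 master weights and two optimizer moments); this rule of thumb is an assumption for illustration, not a Databricks recommendation, and it ignores activations and framework overhead.

```python
def training_memory_gib(num_params: float, bytes_per_param: int = 16) -> float:
    """Rough GPU memory estimate for mixed-precision Adam training.

    Assumes ~16 bytes/parameter: 2 (fp16 weights) + 2 (fp16 grads)
    + 4 (fp32 master weights) + 8 (fp32 Adam moments). Activations
    and framework overhead are not included.
    """
    return num_params * bytes_per_param / 1024**3

# A 7B-parameter model needs roughly 104 GiB for these states alone,
# more than a single A10 (24 GB) but well within 8 H100s (80 GB each).
print(f"{training_memory_gib(7e9):.0f} GiB")
```

Even this rough estimate makes the A10-versus-H100 trade-off concrete: model states that overflow a single smaller GPU can fit comfortably in H100 HBM.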

Verify GPU connection

Use the nvidia-smi command to confirm that you're connected to 8 H100 GPUs. This command displays GPU information including model, memory, and utilization.

Python
%sh nvidia-smi
Output
Thu Jan 15 17:56:54 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.57.08 Driver Version: 575.57.08 CUDA Version: 12.9 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA H100 80GB HBM3 On | 00000000:53:00.0 Off | 0 |
| N/A 26C P0 70W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA H100 80GB HBM3 On | 00000000:64:00.0 Off | 0 |
| N/A 28C P0 68W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA H100 80GB HBM3 On | 00000000:75:00.0 Off | 0 |
| N/A 26C P0 71W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA H100 80GB HBM3 On | 00000000:86:00.0 Off | 0 |
| N/A 29C P0 68W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 4 NVIDIA H100 80GB HBM3 On | 00000000:97:00.0 Off | 0 |
| N/A 27C P0 67W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 5 NVIDIA H100 80GB HBM3 On | 00000000:A8:00.0 Off | 0 |
| N/A 26C P0 67W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 6 NVIDIA H100 80GB HBM3 On | 00000000:B9:00.0 Off | 0 |
| N/A 26C P0 69W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 7 NVIDIA H100 80GB HBM3 On | 00000000:CA:00.0 Off | 0 |
| N/A 26C P0 67W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
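You can also confirm GPU visibility from Python. The sketch below parses the CSV output of `nvidia-smi --query-gpu=name --format=csv,noheader` instead of the full table above; the sample string stands in for the live command output, and running the subprocess shown in the comment assumes `nvidia-smi` is on the PATH of the attached compute.

```python
import subprocess


def gpu_names(csv_text: str) -> list[str]:
    """Parse `nvidia-smi --query-gpu=name --format=csv,noheader` output
    into a list of GPU model names, one entry per visible GPU."""
    return [line.strip() for line in csv_text.strip().splitlines() if line.strip()]


# On the attached compute you would capture the output directly:
#   csv_text = subprocess.run(
#       ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
#       capture_output=True, text=True, check=True,
#   ).stdout
# Here a sample string mimics the expected output for this configuration.
sample = "\n".join(["NVIDIA H100 80GB HBM3"] * 8)
names = gpu_names(sample)
print(len(names), "GPUs visible")
```

A count of 8 matches the single-node H100 environment selected earlier.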

Hello World example

This example demonstrates how to run a distributed function across multiple GPUs using the @distributed decorator.

The decorated function below is launched on 8 processes, one per GPU on the node the notebook is attached to. The decorator arguments specify the number and type of GPUs.

The function uses the runtime module to access the local and global GPU ranks.

Set remote to False to launch on the H100s connected to the notebook. Set it to True to provision remote GPU resources.

Python
from serverless_gpu import distributed
from serverless_gpu import runtime as rt

@distributed(
    gpus=8,
    gpu_type='h100',
    remote=False,  # Use the GPUs the notebook is running on
)
def hello_world(name: str) -> int:
    # Print from one process only to avoid duplicated output
    if rt.get_local_rank() == 0:
        print('hello world', name)
    return rt.get_global_rank()

# The per-process return values are collected into a list, one entry per GPU
result = hello_world.distributed('SGC')

Output
Warning: serverless_gpu is in Beta. The API is subject to change.
Using log_dir='/tmp/SGC_logs_7_yw1eno'
hello world SGC
Python
assert result == [0, 1, 2, 3, 4, 5, 6, 7]
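The relation between local and global ranks can be made explicit. In a multi-node launch with 8 GPUs per node, the global rank conventionally equals node_index * gpus_per_node + local_rank; the helper below is a plain-Python illustration of that convention, not part of the serverless_gpu API.

```python
def global_rank(node_index: int, local_rank: int, gpus_per_node: int = 8) -> int:
    """Conventional mapping from (node index, local GPU rank) to a global rank,
    as used by most distributed training frameworks."""
    return node_index * gpus_per_node + local_rank


# On a single 8-GPU node (node_index=0), local and global ranks coincide,
# which is why the single-node result above is [0, 1, ..., 7].
ranks = [global_rank(0, lr) for lr in range(8)]
print(ranks)
```

With remote=True and more than one node, the same function would be invoked once per GPU across nodes, and global ranks would continue past 7 on the second node.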
