Get started: Serverless GPU compute with H100 GPUs
This notebook demonstrates how to use Databricks Serverless GPU compute with H100 accelerators. You'll learn how to connect to H100 GPUs and run distributed workloads using the serverless_gpu Python library.
The serverless_gpu library enables seamless execution of GPU workloads directly from Databricks notebooks. It provides decorators and runtime utilities for distributed GPU computing. To learn more, see the Serverless GPU API documentation.
Connect to serverless GPU compute
To run this notebook, you need access to Databricks Serverless GPU compute with H100 accelerators.
- From the compute selector, select Serverless GPU.
- In the "Environment" tab on the right side, select H100 for your accelerator. This option uses 8 H100 chips on a single node.
- Click Apply.
See the Hello World example below for how to target remote GPUs to scale to more resources.
When to use H100 GPUs
Compared to A10 GPUs, H100s deliver higher floating-point throughput (FLOPS) and more high-bandwidth memory (HBM). Use H100s for large model training where high throughput and/or large GPU memory is needed.
Verify GPU connection
Use the nvidia-smi command to confirm that you're connected to 8 H100 GPUs. This command displays GPU information including model, memory, and utilization.
%sh nvidia-smi
Thu Jan 15 17:56:54 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.57.08 Driver Version: 575.57.08 CUDA Version: 12.9 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA H100 80GB HBM3 On | 00000000:53:00.0 Off | 0 |
| N/A 26C P0 70W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA H100 80GB HBM3 On | 00000000:64:00.0 Off | 0 |
| N/A 28C P0 68W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA H100 80GB HBM3 On | 00000000:75:00.0 Off | 0 |
| N/A 26C P0 71W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA H100 80GB HBM3 On | 00000000:86:00.0 Off | 0 |
| N/A 29C P0 68W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 4 NVIDIA H100 80GB HBM3 On | 00000000:97:00.0 Off | 0 |
| N/A 27C P0 67W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 5 NVIDIA H100 80GB HBM3 On | 00000000:A8:00.0 Off | 0 |
| N/A 26C P0 67W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 6 NVIDIA H100 80GB HBM3 On | 00000000:B9:00.0 Off | 0 |
| N/A 26C P0 69W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 7 NVIDIA H100 80GB HBM3 On | 00000000:CA:00.0 Off | 0 |
| N/A 26C P0 67W / 700W | 0MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
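The GPU count can also be checked programmatically. The snippet below is an illustrative sketch, not part of the serverless_gpu API: it counts H100 rows in nvidia-smi's table output with a regular expression, and for demonstration runs against a captured sample of the output above rather than invoking nvidia-smi itself.

```python
import re

def count_h100s(nvidia_smi_output: str) -> int:
    """Count H100 GPU rows in `nvidia-smi` table output."""
    # Each GPU row starts with a pipe, the GPU index, then the model name.
    return len(re.findall(r'\|\s+\d+\s+NVIDIA H100', nvidia_smi_output))

# Captured sample: two GPU rows from the nvidia-smi output above.
sample = """
|   0  NVIDIA H100 80GB HBM3          On  |   00000000:53:00.0 Off |                    0 |
|   1  NVIDIA H100 80GB HBM3          On  |   00000000:64:00.0 Off |                    0 |
"""
print(count_h100s(sample))  # 2
```

In a live notebook you could feed the function real output, for example via `subprocess.run(['nvidia-smi'], capture_output=True, text=True).stdout`, and assert the count is 8.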
Hello World example
This example demonstrates how to run a distributed function across multiple GPUs using the @distributed decorator.
The decorated function below is launched on 8 processes, one per GPU on the node the notebook is attached to. The decorator's gpus argument specifies the number of GPUs.
The function uses the runtime module to access the local and global GPU ranks.
Set remote=False to launch on the H100s attached to the notebook; set remote=True to provision remote GPU resources instead.
from serverless_gpu import distributed
from serverless_gpu import runtime as rt

@distributed(
    gpus=8,
    gpu_type='h100',
    remote=False,  # Use the GPUs the notebook is running on
)
def hello_world(name: str) -> list[int]:
    # Print only from the first process on the node
    if rt.get_local_rank() == 0:
        print('hello world', name)
    # Each process returns its global rank; the launcher gathers them
    return rt.get_global_rank()

result = hello_world.distributed('SGC')
Warning: serverless_gpu is in Beta. The API is subject to change.
Using log_dir='/tmp/SGC_logs_7_yw1eno'
Warning: serverless_gpu is in Beta. The API is subject to change.
hello world SGC
Warning: serverless_gpu is in Beta. The API is subject to change.
assert result == [0, 1, 2, 3, 4, 5, 6, 7]
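Conceptually, hello_world.distributed('SGC') runs one copy of the function per GPU process and gathers the per-process return values into a list ordered by global rank. The following is a minimal pure-Python simulation of that gather semantics, using threads in place of GPU processes; it is not the actual serverless_gpu implementation, and hello_world_sim and simulate_distributed are hypothetical names introduced here for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate_distributed(fn, world_size: int, *args):
    """Run `fn(rank, *args)` once per rank and gather results in rank order."""
    with ThreadPoolExecutor(max_workers=world_size) as pool:
        futures = [pool.submit(fn, rank, *args) for rank in range(world_size)]
    # Collect results in submission (rank) order, mirroring the list
    # returned by the .distributed(...) call in the example above.
    return [f.result() for f in futures]

def hello_world_sim(rank: int, name: str) -> int:
    if rank == 0:  # print once, like local rank 0 in the real example
        print('hello world', name)
    return rank  # stands in for rt.get_global_rank()

result = simulate_distributed(hello_world_sim, 8, 'SGC')
assert result == [0, 1, 2, 3, 4, 5, 6, 7]
```

This also shows why the earlier assert holds: each of the 8 processes contributes its own global rank, so the gathered list is [0, 1, ..., 7].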
Next steps
- Best practices for Serverless GPU compute
- Troubleshoot issues on serverless GPU compute
- Multi-GPU and multi-node distributed training
- Serverless GPU API documentation