Serverless GPU compute

Beta

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Databricks previews.

This article describes serverless GPU compute on Databricks and provides recommended use cases, guidance for how to set up GPU compute resources, and feature limitations.

What is serverless GPU compute?

Serverless GPU compute is part of the Serverless compute offering. It is specialized for custom single-node and multi-node deep learning workloads. You can use serverless GPU compute to train and fine-tune custom models using your favorite frameworks and get state-of-the-art efficiency, performance, and quality.

Serverless GPU compute includes:

An integrated experience across Notebooks, Unity Catalog, and MLflow: You can develop your code interactively using Notebooks.
A10 and H100 GPU accelerators: Use A10 GPUs for cost-effective, small to medium machine learning and deep learning tasks, such as classic ML models or fine-tuning smaller language models. Choose H100 GPUs for large-scale AI workloads, including training or fine-tuning massive models or running advanced deep learning tasks.
Multi-GPU and multi-node support: You can run distributed training workloads on multiple GPUs (A10s and H100s) and on multiple nodes (A10s only) using the Serverless GPU Python API. See Distributed training.

The pre-installed packages on serverless GPU compute are not a replacement for Databricks Runtime ML. While there are common packages, not all Databricks Runtime ML dependencies and libraries are reflected in the serverless GPU compute environment.
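Because the pre-installed package set differs from Databricks Runtime ML, it can help to check which versions are actually present before relying on them. The following is a minimal sketch that uses only the Python standard library. The helper name `installed_versions` and the package names passed to it are illustrative assumptions, not a statement of what any environment ships.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


# Hypothetical list of packages a training notebook might depend on.
for name, version in installed_versions(["torch", "transformers", "xgboost"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Any package reported as not installed can then be added through the Environment side panel or with pip.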

Python environments on Serverless GPU compute

Databricks provides two managed environments to serve different use cases.

Note

Workspace base environments are not supported for serverless GPU compute. Instead, use the default or AI environment, and either specify additional dependencies directly in the Environments side panel or install them with pip.

Default base environment

The default base environment provides a minimal environment with a stable client API to ensure application compatibility. Only required Python packages are installed. This allows Databricks to upgrade the server independently, delivering performance improvements, security enhancements, and bug fixes without requiring any code changes to workloads. This is the default environment when you choose serverless GPU compute. Choose this environment if you want to fully customize the environment for your training.

For details about the package versions installed in each environment version, see the release notes.

AI environment

The Databricks AI environment is available in serverless GPU environment 4. The AI environment is built on top of the default base environment with common runtime packages and packages specific to machine learning on GPUs. It contains popular machine learning libraries, including PyTorch, LangChain, Transformers, Ray, and XGBoost for model training and inference. Choose this environment for running training workloads.

For details about the package versions installed in each AI environment version, see the release notes:

AI environment 4

Recommended use cases

Databricks recommends serverless GPU compute for any model training use case that requires training customizations and GPUs.

For example:

LLM fine-tuning
Computer vision
Recommender systems
Reinforcement learning
Deep-learning-based time series forecasting

Requirements

A workspace in either us-west-2 or us-east-1.

Set up serverless GPU compute

To connect your notebook to serverless GPU compute and configure the environment:

From a notebook, click the Connect drop-down menu at the top and select Serverless GPU.
Click the environment icon to open the Environment side panel.
Select A10 or H100 from the Accelerator field.
Select None for the default environment or AI v4 for the AI environment from the Base environment field.
If you chose None from the Base environment field, select the Environment version.
Click Apply, and then click Confirm to apply serverless GPU compute to your notebook environment.
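After the notebook is connected, one way to confirm that the selected accelerator is visible to your code is to query it from Python. The sketch below assumes PyTorch is available (it is listed for the AI environment; the default base environment may not include it), and falls back to an empty list when it is not. The function name `visible_gpus` is hypothetical.

```python
def visible_gpus():
    """Return the names of CUDA devices PyTorch can see.

    Returns an empty list if PyTorch is not installed or no GPU is visible.
    """
    try:
        import torch  # Assumption: present in the AI environment.
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]


gpus = visible_gpus()
print(f"{len(gpus)} GPU(s) visible: {gpus}")
```

An empty list usually means either that the notebook is not attached to GPU compute or that PyTorch still needs to be installed in the chosen environment.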

Note

Connection to your compute auto-terminates after 60 minutes of inactivity.

Add libraries to the environment

You can install additional libraries to the serverless GPU compute environment. See Add dependencies to the notebook.

Note

Adding dependencies using the Environments panel, as described in Add dependencies to the notebook, is not supported for serverless GPU compute scheduled jobs.

Create and schedule a job

The following steps show how to create and schedule jobs for your serverless GPU compute workloads. See Create and manage scheduled notebook jobs for more details.

After you open the notebook you want to use:

Select the Schedule button on the top right.
Select Add schedule.
Populate the New schedule form with the Job name, Schedule, and Compute.
Select Create.

You can also create and schedule jobs from the Jobs and pipelines UI. See Create a new job for step-by-step guidance.

Distributed training

Note

Multi-GPU distributed training is supported on both H100s and A10s. Multi-node distributed training is only supported on A10 GPUs.

See Distributed Training.
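The accelerator rules above can be encoded as a small pre-flight check that runs before a distributed job is launched. This is only a sketch of the documented constraints, not part of the Serverless GPU Python API: the function name, its parameters, and the GPU counts used in the usage lines are hypothetical.

```python
# Constraints taken from this article: A10 and H100 are the supported
# accelerators, and multi-node training is supported on A10 only.
SUPPORTED_ACCELERATORS = {"A10", "H100"}
MULTI_NODE_ACCELERATORS = {"A10"}


def check_training_config(accelerator, num_nodes=1, gpus_per_node=1):
    """Validate a requested layout and return the total number of GPUs.

    Raises ValueError if the layout conflicts with the documented limits.
    """
    if accelerator not in SUPPORTED_ACCELERATORS:
        raise ValueError(f"Unsupported accelerator: {accelerator!r}")
    if num_nodes < 1 or gpus_per_node < 1:
        raise ValueError("num_nodes and gpus_per_node must be at least 1")
    if num_nodes > 1 and accelerator not in MULTI_NODE_ACCELERATORS:
        raise ValueError(f"Multi-node training is not supported on {accelerator}")
    return num_nodes * gpus_per_node


# Hypothetical layouts; the GPU counts are illustrative only.
print(check_training_config("A10", num_nodes=2, gpus_per_node=4))
print(check_training_config("H100", num_nodes=1, gpus_per_node=8))
```

Failing fast on an unsupported layout avoids waiting for compute to start only to have the job rejected.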

Limitations

Serverless GPU compute only supports A10 and H100 accelerators.
H100 accelerators only support single-node workflows and jobs. Multi-node workflows on H100s are not yet supported.
PrivateLink is not supported. This includes storage or pip repositories behind PrivateLink.
Serverless GPU compute is not supported for compliance security profile workspaces (like HIPAA or PCI). Processing regulated data is not supported at this time.
Serverless GPU compute is only supported on interactive environments.
For scheduled jobs on Serverless GPU compute, auto recovery behavior for incompatible package versions that are associated with your notebook is not supported.
The maximum runtime for a workload is seven days. For model training jobs that exceed this limit, implement checkpointing and restart the job after the maximum runtime is reached.
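The checkpoint-and-restart pattern mentioned in the runtime limit can be sketched in a framework-agnostic way as follows. This example uses only the standard library and a placeholder "training step"; in a real workload you would save model and optimizer state with your framework's own facilities instead. All names here (`save_checkpoint`, `load_checkpoint`, `train`, `stop_after`) are hypothetical.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then rename, so a crash never leaves a partial file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the last saved state, or a fresh state if no checkpoint exists."""
    if not os.path.exists(path):
        return {"step": 0, "loss": None}
    with open(path) as f:
        return json.load(f)


def train(path, total_steps, checkpoint_every=10, stop_after=None):
    """Resume from the checkpoint at `path` and run until `total_steps`.

    `stop_after` simulates the run being cut off after that many steps.
    """
    state = load_checkpoint(path)
    steps_this_run = 0
    while state["step"] < total_steps:
        if stop_after is not None and steps_this_run >= stop_after:
            return state  # Simulated interruption: progress since last save is lost.
        state["step"] += 1
        state["loss"] = 1.0 / state["step"]  # Placeholder for a real training step.
        steps_this_run += 1
        if state["step"] % checkpoint_every == 0 or state["step"] == total_steps:
            save_checkpoint(path, state)
    return state


with tempfile.TemporaryDirectory() as tmp_dir:
    ckpt = os.path.join(tmp_dir, "checkpoint.json")
    train(ckpt, total_steps=50, stop_after=25)   # First run is interrupted.
    print("resuming from step", load_checkpoint(ckpt)["step"])
    print("finished at step", train(ckpt, total_steps=50)["step"])
```

The checkpoint interval is a trade-off: frequent saves lose less work on restart but add I/O overhead to every run.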

Data loading

See Load data on Serverless GPU compute.

Best practices

See Best practices for Serverless GPU compute.

Troubleshoot issues on Serverless GPU compute

If you encounter problems running workloads on Serverless GPU compute, see the troubleshooting guide for common issues, workarounds, and support resources.

Notebook examples

The following notebook examples demonstrate how to use Serverless GPU compute for different tasks.

Task	Description
Large language models (LLMs)	Examples for fine-tuning large language models including parameter-efficient methods like Low-Rank Adaptation (LoRA) and supervised fine-tuning approaches.
Computer vision	Examples for computer vision tasks including object detection and image classification.
Deep learning based recommender systems	Examples for building recommendation systems using modern deep learning approaches like two-tower models.
Classic ML	Examples for traditional machine learning tasks including XGBoost model training and time series forecasting.
Multi-GPU and multi-node distributed training	Examples for scaling training across multiple GPUs and nodes using the Serverless GPU API, including distributed fine-tuning.

Multi-GPU training examples

See Multi-GPU and multi-node distributed training for notebooks that demonstrate how to use various distributed training libraries for multi-GPU training.
