AI Runtime

Public Preview

AI Runtime for single-node tasks is in Public Preview. The distributed training API for multi-GPU workloads remains in Beta.

Overview of AI Runtime

AI Runtime is a Databricks compute offering designed for deep learning workloads, bringing GPU support to Databricks Serverless. Use AI Runtime to train and fine-tune custom models with your favorite frameworks and get state-of-the-art efficiency, performance, and quality.

Key features

  • Fully managed GPU infrastructure — Serverless, flexible access to GPUs, with no cluster configuration, driver selection, or autoscaling policies to manage.
  • A runtime dedicated for deep learning — Choose either a minimal default base environment for maximum flexibility over dependencies or a full-featured AI environment pre-loaded with popular ML frameworks.
  • Natively integrated across notebooks, jobs, Unity Catalog, and MLflow for seamless development, data access, and experiment tracking.

Hardware options

| Accelerator | Best for | Multi-GPU |
| --- | --- | --- |
| A10 | Small to medium ML and deep learning tasks, such as classic ML models or fine-tuning smaller language models | No |
| H100 | Large-scale AI workloads, including training or fine-tuning massive models or running advanced deep learning tasks | Yes (8 GPUs) |

Databricks recommends AI Runtime for any custom model training use cases that involve deep learning, large-scale classic workloads, or GPUs.

For example:

  • LLM fine-tuning (LoRA, QLoRA, full fine-tuning)
  • Computer vision (object detection, image classification)
  • Deep-learning-based recommender systems
  • Reinforcement learning
  • Deep-learning-based time series forecasting

Requirements

  • A workspace in one of the following AWS-supported regions:
    • us-west-2
    • us-west-1
    • us-east-1
    • us-east-2
    • ca-central-1
    • sa-east-1
  • The AI Runtime preview must be enabled via workspace admin settings. See Manage Databricks previews.

Limitations

  • AI Runtime only supports A10 and H100 accelerators.
  • AI Runtime is not supported for compliance security profile workspaces (like HIPAA or PCI).
  • Adding dependencies using the Environments panel is not supported for AI Runtime scheduled jobs. Install dependencies programmatically using %pip install in your notebook instead.
  • Scheduled jobs on AI Runtime do not support automatic recovery from incompatible package versions associated with your notebook.
  • The maximum runtime for a workload is seven days. For model training jobs that exceed this limit, implement checkpointing and restart the job once the maximum runtime is reached.
  • AI Runtime provides on-demand access to GPU resources. While this leads to easy, flexible access to GPUs, there may be periods where capacity is constrained.
  • AI Runtime leverages cross-region GPUs in certain cases during moments of high demand. There may be egress costs associated with such usage.
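Because workloads are capped at seven days, long-running training jobs should persist progress and resume on restart. The following is a minimal, framework-agnostic sketch of that checkpoint-and-resume pattern; the file path, JSON format, and per-epoch save cadence are illustrative assumptions, not AI Runtime requirements.

```python
# Illustrative checkpoint/resume pattern for jobs that may hit the
# seven-day runtime limit. The checkpoint location and format here are
# placeholders; in practice you would checkpoint model and optimizer
# state to durable storage such as a Unity Catalog volume.
import json
import os
import tempfile

# Hypothetical checkpoint path; a fresh temp directory keeps this sketch
# self-contained.
CKPT = os.path.join(tempfile.mkdtemp(), "train_ckpt.json")

def load_checkpoint():
    """Return the last completed epoch, or -1 if no checkpoint exists."""
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            return json.load(f)["epoch"]
    return -1

def save_checkpoint(epoch):
    """Persist progress so a restarted job can resume where it left off."""
    with open(CKPT, "w") as f:
        json.dump({"epoch": epoch}, f)

def train(total_epochs):
    """Run (or resume) training; returns the epoch it started from."""
    start = load_checkpoint() + 1  # resume after the last completed epoch
    for epoch in range(start, total_epochs):
        # ... one epoch of training would run here ...
        save_checkpoint(epoch)
    return start
```

If the job is stopped after the first run and rescheduled, the second run picks up from the last saved epoch instead of starting over.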

Connecting to AI Runtime

You can connect to AI Runtime interactively from notebooks, schedule notebooks as recurring jobs, or programmatically create jobs using the Jobs API and Databricks Asset Bundles. For step-by-step instructions, see Connect to AI Runtime.
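For the programmatic path, a Databricks Asset Bundles definition might look roughly like the following. This is a hedged sketch: the job name, task key, and notebook path are placeholders, and the exact compute settings for targeting AI Runtime are covered in Connect to AI Runtime.

```yaml
# Hypothetical bundle fragment -- names and paths are placeholders.
resources:
  jobs:
    nightly_fine_tune:
      name: nightly-fine-tune
      tasks:
        - task_key: train
          notebook_task:
            notebook_path: ./notebooks/fine_tune.py
```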

Setting up your environment

AI Runtime offers two managed Python environments: a minimal default base environment, and a full-featured Databricks AI environment that is pre-loaded with popular ML frameworks like PyTorch and Transformers. For details on choosing an environment, caching behavior, importing custom modules, and known limitations, see How to set up your environment.

Reading in your data

Understanding how data access works on AI Runtime is essential for a smooth experience. For details, see Load data on AI Runtime.

Distributed training

Beta

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Databricks previews.

AI Runtime supports distributed training across multiple GPUs on the single node your notebook is connected to. Using the @distributed decorator from the serverless_gpu Python API (Beta), you can launch multi-GPU workloads with PyTorch DDP, FSDP, or DeepSpeed with minimal configuration. For details, see Multi-GPU workload.
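As a rough illustration of the pattern, a training function wrapped with the decorator might look like the sketch below. This assumes the Databricks serverless GPU environment; the decorator's arguments (if any) and the DDP setup details are assumptions here, so check Multi-GPU workload for the current signature.

```python
# Hedged sketch of the Beta serverless_gpu API described above; decorator
# arguments are assumptions -- see "Multi-GPU workload" for the current API.
from serverless_gpu import distributed

@distributed  # launches one process per GPU on the attached node
def train():
    import torch
    import torch.distributed as dist

    # Standard PyTorch DDP setup; the launcher is expected to provide the
    # usual rank/world-size environment variables.
    dist.init_process_group(backend="nccl")
    local_rank = dist.get_rank() % torch.cuda.device_count()
    model = torch.nn.Linear(16, 1).cuda(local_rank)
    ddp_model = torch.nn.parallel.DistributedDataParallel(
        model, device_ids=[local_rank]
    )
    # ... training loop over your sharded data would go here ...
    dist.destroy_process_group()

train()
```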

Experiment tracking and observability

For MLflow integration, viewing logs, and model checkpoint management, see Experiment tracking and observability.

Genie Code for deep learning

Genie Code supports deep learning workloads on AI Runtime. It can help with generating training code, resolving library installation errors, suggesting optimizations, and debugging common issues. See Use Genie Code for data science.

Guides

For migration from classic workloads, example notebooks, and troubleshooting, see User guides for AI Runtime.