Train AI and ML models
Databricks offers flexible compute solutions tailored to different machine learning needs, ranging from managed cluster runtimes to fully serverless GPU environments.
- Serverless GPU compute: a serverless GPU environment optimized for custom single-node and multi-node deep learning workloads.
- Databricks Runtime for Machine Learning: a classic compute environment with pre-built libraries for traditional machine learning and deep learning workloads.
Serverless GPU compute (Beta)
This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Databricks previews.
Serverless GPU compute is a specialized offering within the Databricks serverless ecosystem. It is optimized for custom single-node and multi-node deep learning workloads, such as fine-tuning LLMs or training computer vision models.
Key features include:
- Instant availability: Removes the need to manage underlying cluster infrastructure, allowing you to connect a notebook directly to serverless GPU resources.
- High-performance hardware: Provides access to A10 GPUs for cost-effective tasks and H100 GPUs for large-scale AI workloads.
- Managed environments: Offers a default base environment for full customization or an AI environment pre-loaded with common ML packages like Transformers and Ray.
- Flexible scaling: Supports distributed training across multiple GPUs and nodes.
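The flexible-scaling bullet follows the standard data-parallel training pattern: each worker computes a gradient on its own shard of the data, the gradients are averaged across workers (an all-reduce), and every worker applies the same update. A minimal CPU-only sketch of that pattern in plain Python, with no Databricks or GPU APIs; the toy model and function names are hypothetical:

```python
# Data-parallel training sketch: fit w in y = w * x by least squares.
# Real multi-GPU/multi-node training runs the per-shard gradient step
# concurrently on separate devices; here the workers run sequentially.

def local_gradient(w, shard):
    # dL/dw for L = mean((w*x - y)^2) over this worker's shard
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def train(data, num_workers=4, lr=0.01, steps=200):
    # Split the dataset into one shard per worker.
    shards = [data[i::num_workers] for i in range(num_workers)]
    w = 0.0
    for _ in range(steps):
        # Each worker computes its local gradient (in parallel in practice).
        grads = [local_gradient(w, shard) for shard in shards]
        # "All-reduce": average gradients, then apply one shared update.
        w -= lr * sum(grads) / len(grads)
    return w

data = [(x, 3.0 * x) for x in range(1, 9)]  # true slope is 3.0
print(round(train(data), 2))  # → 3.0
```

Serverless GPU compute handles the part this sketch glosses over: provisioning the workers, placing them on GPUs across nodes, and performing the gradient exchange efficiently.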
Databricks Runtime for Machine Learning
Databricks Runtime for Machine Learning is a specialized runtime that automates the creation of compute resources with pre-built infrastructure. It is designed for users who want a comprehensive, ready-to-use environment for both classic machine learning and deep learning.
Key features include:
- Pre-installed libraries: Includes popular libraries like PyTorch, TensorFlow, and XGBoost, which receive frequent updates and optimized support.
- Compute versatility: Supports both CPU and GPU-based instance types, including AWS Graviton for improved price-to-performance.
- Optimization: Offers integration with Photon to accelerate Spark SQL, DataFrames, and feature engineering tasks.
- Access control: Requires dedicated access mode for secure data access through Unity Catalog.
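Putting these points together: you opt into Databricks Runtime for Machine Learning by choosing an ML runtime version when creating a cluster, and the access-control bullet corresponds to the cluster's security mode. A sketch of a Clusters API request body under those assumptions; the version string and node type are illustrative examples, so check your workspace for currently supported values:

```json
{
  "cluster_name": "ml-training",
  "spark_version": "15.4.x-gpu-ml-scala2.12",
  "node_type_id": "g5.xlarge",
  "num_workers": 2,
  "data_security_mode": "SINGLE_USER"
}
```

The `-ml-` component of `spark_version` selects the ML runtime (with `-gpu-` variants for GPU instance types), and `SINGLE_USER` is the dedicated access mode required for secure Unity Catalog data access.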