AWS Graviton-enabled clusters
Databricks clusters support AWS Graviton instances. These instances use AWS-designed Graviton processors that are built on top of the Arm64 instruction set architecture. AWS claims that instance types with these processors have the best price-to-performance ratio of any instance type on Amazon EC2.
Availability
Databricks supports AWS Graviton-enabled clusters:
On Databricks Runtime 9.1 LTS and above for non-Photon, and Databricks Runtime 10.2 (Unsupported) and above for Photon.
In all AWS Regions. Note, however, that not all instance types are available in all Regions; if you select an instance type that is not available in your workspace's Region, cluster creation fails.
For AWS Graviton2 and Graviton3 processors.
Note
Delta Live Tables is not supported on Graviton-enabled clusters.
Create an AWS Graviton-enabled cluster
Use the instructions in Create a cluster to create your AWS Graviton-enabled cluster.
The process to specify the cluster’s AWS Graviton instance type depends on the method you use to create the cluster. The instructions that follow are specific for each cluster creation process:
Create button or cluster UI
Follow the instructions in Create a cluster. For Databricks runtime version, select one of the runtimes as listed in the preceding Availability section. For Worker type, Driver type, or both, select one of the available AWS Graviton instance types as listed in the preceding Availability section.
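When picking a Worker or Driver type in the UI, Graviton families follow the EC2 naming convention of a "g" immediately after the generation digit (for example, m6g, m6gd, c7g). As a rough illustrative check only, not an official API lookup, a helper like the following can flag whether a node type ID looks Graviton-based:

```python
import re

def looks_like_graviton(node_type_id: str) -> bool:
    """Heuristic: EC2 Graviton families place a 'g' right after the
    generation digit (m6g, m6gd, c7gn, r7g, ...). This checks the naming
    convention only; it is not an authoritative instance-type lookup."""
    family = node_type_id.split(".")[0]  # e.g. "m6gd" from "m6gd.large"
    return re.match(r"^[a-z]+\d+g", family) is not None

print(looks_like_graviton("m6gd.large"))  # True
print(looks_like_graviton("i3.xlarge"))   # False
```

Note that GPU families such as g4dn are correctly rejected, because their "g" comes before the generation digit rather than after it.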
Databricks REST API
Set up authentication for the Databricks workspace REST API, if you have not done so already.
Use your tool of choice to call the Databricks REST API, such as curl or Postman.
Call the POST clusters/create operation in the Clusters API. For example, you can use curl to make a call similar to the following:

```bash
curl --netrc -X POST \
  https://dbc-a1b2345c-d6e7.cloud.databricks.com/api/2.0/clusters/create \
  --data @create-cluster.json
```
create-cluster.json:

```json
{
  "cluster_name": "my-cluster",
  "spark_version": "10.2.x-scala2.12",
  "node_type_id": "m6gd.large",
  "num_workers": 2
}
```
The preceding request payload specifies a non-Photon runtime. To specify a Photon runtime, add "runtime_engine": "PHOTON" to the request payload, as follows. (Do not add photon anywhere in the spark_version field.)

For Photon:

```json
{
  "cluster_name": "my-cluster",
  "spark_version": "10.2.x-scala2.12",
  "node_type_id": "m6gd.large",
  "num_workers": 2,
  "runtime_engine": "PHOTON"
}
```
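Rather than hand-editing the JSON for the Photon and non-Photon variants, the request payload can be assembled programmatically. The field names below match the Clusters API request shown above; the helper function itself is just an illustrative sketch:

```python
import json

def graviton_cluster_payload(name: str, photon: bool = False) -> dict:
    """Build a clusters/create request payload for a Graviton node type.
    Adds runtime_engine only for Photon; never puts 'photon' in spark_version."""
    payload = {
        "cluster_name": name,
        "spark_version": "10.2.x-scala2.12",
        "node_type_id": "m6gd.large",
        "num_workers": 2,
    }
    if photon:
        payload["runtime_engine"] = "PHOTON"
    return payload

# Serialize for use as the request body or a --data file:
print(json.dumps(graviton_cluster_payload("my-cluster", photon=True), indent=2))
```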
Databricks CLI
Set up the CLI and set up authentication, if you have not done so already.
Run the clusters create subcommand in the Clusters CLI (legacy). For example, you can run a command similar to the following:

```bash
databricks clusters create --json-file create-cluster.json
```
create-cluster.json:

```json
{
  "cluster_name": "my-cluster",
  "spark_version": "10.2.x-scala2.12",
  "node_type_id": "m6gd.large",
  "num_workers": 2
}
```
The preceding request payload specifies a non-Photon runtime. To specify a Photon runtime, add "runtime_engine": "PHOTON" to the request payload, as follows. (Do not add photon anywhere in the spark_version field.)

For Photon:

```json
{
  "cluster_name": "my-cluster",
  "spark_version": "10.2.x-scala2.12",
  "node_type_id": "m6gd.large",
  "num_workers": 2,
  "runtime_engine": "PHOTON"
}
```
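Because the CLI reads the same JSON shape from a file, the file can also be generated programmatically. The sketch below writes the Photon variant; the CLI invocation is shown as a comment because it requires a configured legacy Clusters CLI:

```python
import json
from pathlib import Path

spec = {
    "cluster_name": "my-cluster",
    "spark_version": "10.2.x-scala2.12",
    "node_type_id": "m6gd.large",
    "num_workers": 2,
    "runtime_engine": "PHOTON",  # omit this key for a non-Photon runtime
}

# Write the spec in the format the CLI expects
Path("create-cluster.json").write_text(json.dumps(spec, indent=2))

# Then, with the legacy Clusters CLI configured:
#   databricks clusters create --json-file create-cluster.json
```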
Databricks Terraform provider
Install and configure the command line tools that Terraform needs to operate, if you have not done so already.
Create and run a Terraform configuration that creates a Databricks cluster resource. For example, you can run a minimal configuration similar to the following:
```hcl
terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
    }
  }
}

provider "databricks" {}

resource "databricks_cluster" "this" {
  cluster_name  = "my-cluster"
  spark_version = "10.2.x-scala2.12"
  node_type_id  = "m6gd.large"
  num_workers   = 2
}
```
The preceding configuration specifies a non-Photon runtime. To specify a Photon runtime, add runtime_engine = "PHOTON" to the resource, as follows. (Do not add photon anywhere in the spark_version field.)

For Photon:

```hcl
resource "databricks_cluster" "this" {
  cluster_name   = "my-cluster"
  spark_version  = "10.2.x-scala2.12"
  node_type_id   = "m6gd.large"
  num_workers    = 2
  runtime_engine = "PHOTON"
}
```
ARM64 ISA limitations
Floating-point precision changes: typical operations such as addition, subtraction, multiplication, and division show no change in precision. For single trigonometric functions such as sin and cos, the upper bound on the precision difference relative to Intel instances is 1.11e-16.
Third-party support: the change in ISA may have some impact on support for third-party tools and libraries.
Mixed-instance clusters: Databricks does not support mixing AWS Graviton and non-AWS Graviton instance types, as each type requires a different Databricks Runtime.
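As a numerical aside (not from the original text), the stated 1.11e-16 bound corresponds to half a unit in the last place (ulp) for IEEE 754 double-precision values near 1.0, which can be checked directly:

```python
import math

# For IEEE 754 doubles, the spacing between representable values near 1.0
# (one ulp) is 2**-52; half of that spacing is the 1.11e-16 bound quoted above.
half_ulp_at_one = math.ulp(1.0) / 2
print(half_ulp_at_one)  # 1.1102230246251565e-16
```

In other words, trigonometric results on Graviton may differ from Intel by at most the rounding granularity of a double near 1.0.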
Unsupported features
The following features do not support AWS Graviton instance types:
Python UDFs in Unity Catalog
Databricks Runtime for Machine Learning
Databricks Container Services
Delta Live Tables
Databricks SQL
See also
AWS Graviton Processor on the AWS website
AWS Graviton Getting Started on GitHub
AWS News Blog: Graviton on the AWS website