AWS Graviton-enabled clusters

Databricks clusters support AWS Graviton instances. These instances use AWS-designed Graviton processors that are built on top of the Arm64 instruction set architecture. AWS claims that instance types with these processors have the best price-to-performance ratio of any instance type on Amazon EC2.

Availability

Databricks supports AWS Graviton-enabled clusters:

  • On Databricks Runtime 9.1 LTS and above for non-Photon, and Databricks Runtime 10.2 (unsupported) and above for Photon.

  • In all AWS Regions. Note, however, that not all instance types are available in all Regions. If you select an instance type that is not available in the Region for a workspace, you get a cluster creation failure.

  • For AWS Graviton2 and Graviton3 processors.

Note

Delta Live Tables is not supported on Graviton-enabled clusters.

Create an AWS Graviton-enabled cluster

Use the instructions in Create a cluster to create your AWS Graviton-enabled cluster.

The process to specify the cluster’s AWS Graviton instance type depends on the method you use to create the cluster. The following instructions are specific to each cluster creation method:

Create button or cluster UI

Follow the instructions in Create a cluster. For Databricks runtime version, select one of the runtimes as listed in the preceding Availability section. For Worker type, Driver type, or both, select one of the available AWS Graviton instance types as listed in the preceding Availability section.

Databricks REST API

  1. Set up authentication for the Databricks REST API, if you have not done so already.

  2. Use your tool of choice, such as curl or Postman, to call the Databricks REST API.

  3. Call the POST /api/2.0/clusters/create operation. For example, you can use curl to make a call similar to the following. In this curl call example and its related JSON request payload file, replace the following placeholders:

    • Replace <workspace-instance> with your Databricks workspace instance name, for example dbc-a1b2345c-d6e7.cloud.databricks.com.

    • Replace <personal-access-token> with the Databricks personal access token that is associated with your Databricks user account for this workspace.

    • Replace <cluster-name> with some display name for the new cluster, for example My New Cluster.

    • Replace <spark-version> with the ID of the Spark version for the new cluster, for example 13.3.x-scala2.12. To get a list of available Spark version IDs, call the GET /api/2.0/clusters/spark-versions operation.

    • Replace <node-type-id> with the ID of the node type for the new cluster, for example m6gd.large. To get a list of available node type IDs, call the GET /api/2.0/clusters/list-node-types operation. (A sketch that filters this list down to Graviton node types follows the Photon example below.) Note, however, that not all node types are available in all Regions. If you select a node type that is not available in the Region for a workspace, you get a cluster creation failure.

    • For num_workers, replace 1 with the number of worker nodes for the new cluster.

    curl --request POST \
    https://<workspace-instance>/api/2.0/clusters/create \
    --header "Authorization: Bearer <personal-access-token>" \
    --data @create-cluster.json
    

    In a file named create-cluster.json in the same directory where you run the preceding curl call, add content such as the following:

    {
      "cluster_name": "<cluster-name>",
      "spark_version": "<spark-version>",
      "node_type_id": "<node-type-id>",
      "num_workers": 1
    }
    

    The preceding request specifies a non-Photon runtime. To specify a Photon runtime, add the PHOTON runtime engine setting, as follows. (Do not add PHOTON anywhere in the Spark version setting.)

    For Photon:

    {
      "cluster_name": "<cluster-name>",
      "spark_version": "<spark-version>",
      "node_type_id": "<node-type-id>",
      "num_workers": 1,
      "runtime_engine": "PHOTON"
    }
    
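
    The placeholder list above mentions the GET /api/2.0/clusters/list-node-types operation. To narrow its response down to Graviton node types, you can filter the returned node type IDs. The following is a minimal sketch, assuming jq is installed; the grep pattern relies on AWS's Graviton instance family naming (m6g, m6gd, c7g, and so on) rather than a dedicated API filter:

    # List all node type IDs in the workspace, then keep only AWS Graviton
    # families such as m6g, m6gd, c6gd, c7g, and r6g. The pattern is a
    # naming heuristic, not an official API filter.
    curl --request GET \
    https://<workspace-instance>/api/2.0/clusters/list-node-types \
    --header "Authorization: Bearer <personal-access-token>" \
    | jq -r '.node_types[].node_type_id' \
    | grep -E '^[a-z]+[0-9]+g'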

Databricks CLI

  1. Install the Databricks CLI and set up the Databricks CLI for authentication, if you have not done so already.

  2. Run the clusters create subcommand. For example, you can run the subcommand as follows. In this command example, replace the following placeholders:

    • Replace <spark-version> with the ID of the Spark version for the new cluster, for example 13.3.x-scala2.12. To get a list of available Spark version IDs, run the command databricks clusters spark-versions.

    • Replace <cluster-name> with some display name for the new cluster, for example My New Cluster.

    • Replace <node-type-id> with the ID of the node type for the new cluster, for example m6gd.large. To get a list of available node type IDs, run the command databricks clusters list-node-types. (A sketch that filters this list down to Graviton node types follows the Photon example below.) Note, however, that not all node types are available in all Regions. If you select a node type that is not available in the Region for a workspace, you get a cluster creation failure.

    • Replace <num-workers> with the number of worker nodes for the new cluster.

    • Replace <profile> with the name of a Databricks configuration profile for your target Databricks authentication type.

    databricks clusters create <spark-version> \
    --cluster-name "<cluster-name>" \
    --node-type-id <node-type-id> \
    --num-workers <num-workers> \
    --profile <profile>
    

    The preceding request specifies a non-Photon runtime. To specify a Photon runtime, add the PHOTON runtime engine setting, as follows. (Do not add PHOTON anywhere in the Spark version setting.)

    For Photon:

    databricks clusters create <spark-version> \
    --cluster-name "<cluster-name>" \
    --node-type-id <node-type-id> \
    --num-workers <num-workers> \
    --runtime-engine PHOTON \
    --profile <profile>
    
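
    To check up front that a Graviton node type is available in your workspace's Region, you can filter the node type list from the CLI. A minimal sketch, assuming jq is installed and JSON output; as in the REST API example, the grep pattern is a naming heuristic, not a CLI filter:

    # List node type IDs and keep only AWS Graviton families (m6g, c7g, and so on)
    databricks clusters list-node-types --output json --profile <profile> \
    | jq -r '.node_types[].node_type_id' \
    | grep -E '^[a-z]+[0-9]+g'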

Databricks Terraform provider

  1. Install and configure the command line tools that Terraform needs to operate, if you have not done so already.

  2. Create and run a Terraform configuration that creates a Databricks cluster resource. For example, you can run a minimal configuration similar to the following. In this configuration example, replace the following placeholders:

    • Replace <cluster-name> with some display name for the new cluster, for example My New Cluster.

    • Replace <spark-version> with the ID of the Spark version for the new cluster, for example 13.3.x-scala2.12. To get a list of available Spark version IDs, call the GET /api/2.0/clusters/spark-versions operation in the Databricks REST API, or run the Databricks CLI command databricks clusters spark-versions.

    • Replace <node-type-id> with the ID of the node type for the new cluster, for example m6gd.large. To get a list of available node type IDs, call the GET /api/2.0/clusters/list-node-types operation in the Databricks REST API, or run the Databricks CLI command databricks clusters list-node-types. Note, however, that not all node types are available in all Regions. If you select a node type that is not available in the Region for a workspace, you get a cluster creation failure.

    • For num_workers, replace 1 with the number of worker nodes for the new cluster.

    terraform {
      required_providers {
        databricks = {
          source = "databricks/databricks"
        }
      }
    }
    
    provider "databricks" {
    }
    
    resource "databricks_cluster" "this" {
      cluster_name  = "<cluster-name>"
      spark_version = "<spark-version>"
      node_type_id  = "<node-type-id>"
      num_workers   = 1
    }
    

    The preceding request specifies a non-Photon runtime. To specify a Photon runtime, add the PHOTON runtime engine setting, as follows. (Do not add PHOTON anywhere in the Spark version setting.)

    For Photon:

    resource "databricks_cluster" "this" {
      cluster_name   = "<cluster-name>"
      spark_version  = "<spark-version>"
      node_type_id   = "<node-type-id>"
      num_workers    = 1
      runtime_engine = "PHOTON"
    }
    
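
    To create the cluster from this configuration, run the standard Terraform workflow from the directory that contains your .tf file:

    # Download the Databricks provider declared in required_providers
    terraform init
    # Preview the cluster that Terraform will create
    terraform plan
    # Create the cluster
    terraform apply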

ARM64 ISA limitations

  • Floating point precision changes: typical operations like adding, subtracting, multiplying, and dividing have no change in precision. For single trigonometric functions such as sin and cos, the upper bound on the precision difference relative to Intel instances is 1.11e-16, which is on the order of the double-precision unit roundoff (2^-53 ≈ 1.11e-16).

  • Third-party support: the change in ISA may affect support for some third-party tools and libraries; precompiled x86_64 binaries and wheels do not run on Arm64. A quick way to check a node's architecture is shown after this list.

  • Mixed-instance clusters: Databricks does not support mixing AWS Graviton and non-AWS Graviton instance types, as each type requires a different Databricks Runtime.
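
If you are unsure whether a third-party tool or library supports Arm64, it can help to confirm which architecture a cluster node reports. On Graviton instances, Linux reports the machine hardware name aarch64; on Intel and AMD instances it reports x86_64. For example, run the following in a notebook cell:

    %sh
    # Prints aarch64 on AWS Graviton instances, x86_64 otherwise
    uname -m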

Unsupported features

The following features do not support AWS Graviton instance types:

  • Compliance security profile

  • Python UDFs in Unity Catalog

  • Databricks Runtime for Machine Learning

  • Databricks Container Services

  • Delta Live Tables

  • Databricks SQL
