Databricks Container Services for standard compute

Beta

Databricks Container Services for standard compute is in Beta. A workspace admin must enable this feature from the workspace Previews page. This is a separate service from Databricks Container Services for dedicated compute, which is generally available.

Databricks Container Services for standard compute lets you specify a Docker image when you create standard compute, giving you access to custom containers in shared compute environments. Your Docker image is the only definition of the workload environment, so you can reproduce the remote environment locally for consistent results across development and production.

To help you build your custom image, Databricks provides a base image, aligned with serverless environment versions, that you can extend to meet your needs.

Requirements

To use Databricks Container Services for standard compute:

The compute resource must be running Databricks Runtime 18.3 or above and use the Standard access mode.
You must have a recent Docker daemon with the docker command available on your PATH.

Step 1: Enable Databricks Container Services for standard compute

To use Databricks Container Services for standard compute, a workspace admin must enable the feature from the Previews page:

Sign in to your Databricks workspace as an administrator.
From the user menu in the upper right, click Previews.
Find DCS for standard compute and turn it on.

Step 2: Build your custom image

These instructions show you how to build a custom image by extending a Databricks-provided base image (recommended). The base image contains the dependencies required to launch your workloads, such as Ubuntu, Python, and JDK. You can pull databricksruntime/environment:v5-standard, layer your packages on top, and inherit ongoing Databricks-managed updates and security patches.

If you would like to build a minimal base image from scratch, see Reference: build a minimal base image from scratch.

Step 2a: Pull the base image

To pull the base image, run:

Bash
docker pull databricksruntime/environment:v5-standard

For AWS Graviton (ARM) instance types, use the ARM variant:

databricksruntime/environment:v5-standard-arm
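As a minimal sketch, the choice between the two tags can be scripted. The function name `base_image_for_arch` is hypothetical and not part of any Databricks tooling. Pass the architecture of the instance type you plan to launch, which is not necessarily the architecture of the machine that runs the build.

```shell
# Hypothetical helper: map a CPU architecture name to the base image tag.
# The two image names come from this page; everything else is illustrative.
base_image_for_arch() {
  case "$1" in
    aarch64|arm64) echo "databricksruntime/environment:v5-standard-arm" ;;
    x86_64|amd64)  echo "databricksruntime/environment:v5-standard" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Example: print the image to pull for AWS Graviton (ARM) instance types.
base_image_for_arch arm64

# To pull it directly:
# docker pull "$(base_image_for_arch arm64)"
```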

Step 2b: Write a Dockerfile that extends the base image

Install custom Python packages into the base image's /databricks/python3 virtual environment. This is the system virtual environment that launches your workloads.

If you are targeting AWS Graviton instance types, replace :v5-standard with :v5-standard-arm in the FROM line.

Dockerfile
FROM databricksruntime/environment:v5-standard

RUN /databricks/python3/bin/python -m pip install <your python package>

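The following example shows how to install a package from a specific package index by setting PIP_INDEX_URL. To install from a private repository at build time, replace the URL with that repository's index URL.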

Dockerfile
FROM databricksruntime/environment:v5-standard

ENV PIP_INDEX_URL=https://pypi.org/simple

RUN /databricks/python3/bin/python -m pip install --no-cache-dir simplejson

You can use any standard Dockerfile instruction (for example, RUN, ENV, WORKDIR, COPY). The following instructions are ignored because of how Databricks launches your workload:

USER
CMD
ENTRYPOINT
EXPOSE
HEALTHCHECK
SHELL
STOPSIGNAL
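Because these instructions are ignored rather than rejected, a quick check before building can make the behavior less surprising. The following is a rough sketch: `find_ignored_instructions` is a hypothetical helper, and the check is line-based, so it might also flag a continuation line of a multi-line RUN that happens to start with one of these words.

```shell
# Hypothetical helper: list Dockerfile lines that use instructions Databricks
# ignores at launch. Prints matches with line numbers.
# Returns 0 if any are found and 1 if none are found (grep's exit status).
find_ignored_instructions() {
  grep -n -i -E \
    '^[[:space:]]*(USER|CMD|ENTRYPOINT|EXPOSE|HEALTHCHECK|SHELL|STOPSIGNAL)([[:space:]]|$)' \
    "$1"
}

# Example: warn before building.
# if find_ignored_instructions Dockerfile; then
#   echo "warning: the instructions above have no effect on Databricks" >&2
# fi
```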

note

For Scala workloads, copy your JAR files into the /scala-jars/user directory in the image and chmod 0644 them so the sandbox user can read them. Databricks loads JARs from this path onto the Spark classpath.

Step 2c: Build the image

To build the image, run:

Bash
docker build -f <your-dockerfile> -t <registry-url>/<project>[/<repo>]:<tag> .

warning

Test your custom image thoroughly on Databricks compute. An image that works on a local or build machine might fail to start, silently disable features, or stop working when launched on Databricks.

Reference: build a minimal base image from scratch

If you need full control over the contents of your base image (for example, to meet strict image-size, supply-chain, or compliance requirements), you can build a minimal equivalent of databricksruntime/environment:v5-standard from scratch instead of extending it.

warning

Building from scratch is an advanced option. You take on responsibility for tracking upstream changes to the v5-standard image, including Python pins, security patches, platform tooling, and the platform-required files under /databricks/ and /etc/environment. Databricks recommends extending databricksruntime/environment:v5-standard instead, as shown earlier in Step 2.

Databricks provides a reference Dockerfile and requirements.txt that recreate the essential Python environment of v5-standard. Download both files into the same directory before building:

Dockerfile (save as Dockerfile, without the .txt extension)
requirements.txt

To build the image, run:

Bash
docker build -t <your-registry>/<repo>:<tag> .

If your build host cannot reach https://pypi.org, override the pip index at build time by running:

Bash
docker build --build-arg PIP_INDEX_URL=https://your-mirror/simple -t <your-registry>/<repo>:<tag> .

Before continuing to the next step, verify that the curated Python packages import cleanly by running:

Bash
docker run --rm --cpus 2 <your-registry>/<repo>:<tag> \
  /databricks/python3/bin/python -c \
  "import pandas, numpy, pyarrow, mlflow, databricks.connect; print('OK')"

Step 3: Push your image to a registry

Next, push your image to a Docker registry. Databricks Container Services supports the same registries on both standard and dedicated compute:

Docker Hub with no authentication or basic authentication.
Azure Container Registry with basic authentication.

Amazon Elastic Container Registry (Amazon ECR) with IAM (with the exception of Commercial Cloud Services (C2S)).

Other registries that support no authentication or basic authentication should also work. Basic authentication uses your registry username and password.

For best image-pull performance, use a registry in the same cloud and region as your Databricks workspace.

Bash
echo "$REGISTRY_PASSWORD" | docker login -u <registry-username> --password-stdin <registry-url>
docker push <registry-url>/<project>[/<repo>]:<tag>

note

If you use Docker Hub, check that your rate limits accommodate the compute you expect to launch in a six-hour period. See the Docker documentation for details. If this limit is exceeded, requests return 429 Too Many Requests.

Step 4: Launch your compute

You can launch compute that uses your custom image using the UI or API. The following requirements must be met:

The compute access mode must be Standard (in the API, set data_security_mode to DATA_SECURITY_MODE_STANDARD). If the compute is set to Dedicated access mode, it uses a different version of Databricks Container Services that expects a different base image, and the compute fails to launch with the image you built.
The Databricks Runtime version must be 18.3 or above.
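If you script compute creation, you might check the runtime requirement before sending a request. The sketch below assumes that spark_version strings begin with `<major>.<minor>`, as in `18.3.x-scala2.13`. The function name `runtime_supported` is hypothetical, and the check does not verify the access mode or any other requirement.

```shell
# Hypothetical helper: succeed if a spark_version string such as
# "18.3.x-scala2.13" is Databricks Runtime 18.3 or above.
# Returns 0 if supported, 1 if too old, 2 if the string cannot be parsed.
runtime_supported() {
  local version major rest minor part
  version=${1%%-*}              # "18.3.x-scala2.13" -> "18.3.x"
  case "$version" in
    *.*) ;;                     # need at least "major.minor"
    *) return 2 ;;
  esac
  major=${version%%.*}          # "18.3.x" -> "18"
  rest=${version#*.}            # "18.3.x" -> "3.x"
  minor=${rest%%.*}             # "3.x" -> "3"
  for part in "$major" "$minor"; do
    case "$part" in
      ''|*[!0-9]*) return 2 ;;  # empty or non-numeric: cannot decide
    esac
  done
  [ "$major" -gt 18 ] || { [ "$major" -eq 18 ] && [ "$minor" -ge 3 ]; }
}

# Example:
runtime_supported "18.3.x-scala2.13" && echo "supported"
```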

note

To launch against an instance pool, the pool must be created with preloaded_docker_images set, and the cluster's docker_image must match. See Use Databricks Container Services with an instance pool before launching.

Launch your compute using the UI

On the Create compute page, ensure the Access mode is set to Standard and Databricks runtime is set to 18.3 or above.
Under Advanced, select the Docker tab.
Select Use your own Docker container.

In the Docker Image URL field, enter your custom image.

Registry	Tag format
Docker Hub	`<organization>/<repository>:<tag>` (for example: `databricksruntime/environment:v5-standard`)
Amazon ECR	`<aws-account-id>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>`
Azure Container Registry	`<your-registry-name>.azurecr.io/<repository-name>:<tag>`

Select the authentication type. See Docker image authentication.
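The tag formats in the table above can be assembled from their parts. The helpers below are a hypothetical sketch, and the account ID, region, repository, and tag in the example are placeholders rather than real resources.

```shell
# Hypothetical helpers that assemble image URLs in the formats listed above.

# args: aws-account-id region repository tag
ecr_image_url() {
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$1" "$2" "$3" "$4"
}

# args: registry-name repository-name tag
acr_image_url() {
  printf '%s.azurecr.io/%s:%s\n' "$1" "$2" "$3"
}

# Example with placeholder values:
ecr_image_url 123456789012 us-west-2 my-team/dcs-env v1
```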

note

If you do not see the Docker settings when you create compute, Databricks Container Services might not be enabled in your workspace. A workspace admin must enable it before any user can specify a Docker image. See Step 1: Enable Databricks Container Services for standard compute.

Launch your compute using the API

The following examples use the Databricks CLI to call the Clusters API and create standard compute with your custom image. Ensure data_security_mode is set to DATA_SECURITY_MODE_STANDARD and spark_version is set to 18.3.x-scala2.13 or above.

With a public image (no authentication):

Bash
databricks clusters create \
--cluster-name <cluster-name> \
--node-type-id <node-type> \
--json '{
  "num_workers": 1,
  "docker_image": {
    "url": "<docker-registry-image-url>"
  },
  "spark_version": "18.3.x-scala2.13",
  "aws_attributes": {
    "availability": "ON_DEMAND"
  },
  "data_security_mode": "DATA_SECURITY_MODE_STANDARD"
}'

With basic authentication:

Bash
databricks clusters create \
--cluster-name <cluster-name> \
--node-type-id <node-type> \
--json '{
  "num_workers": 1,
  "docker_image": {
    "url": "<docker-registry-image-url>",
    "basic_auth": {
      "username": "<docker-registry-username>",
      "password": "<docker-registry-password>"
    }
  },
  "spark_version": "18.3.x-scala2.13",
  "aws_attributes": {
    "availability": "ON_DEMAND"
  },
  "data_security_mode": "DATA_SECURITY_MODE_STANDARD"
}'

With an instance profile (for Amazon ECR):

Bash
databricks clusters create \
--cluster-name <cluster-name> \
--node-type-id <node-type> \
--json '{
  "num_workers": 1,
  "docker_image": {
    "url": "<image-url>"
  },
  "spark_version": "18.3.x-scala2.13",
  "aws_attributes": {
    "availability": "ON_DEMAND",
    "instance_profile_arn": "arn:aws:iam::<aws-account-number>:instance-profile/<iam-role-name>"
  },
  "data_security_mode": "DATA_SECURITY_MODE_STANDARD"
}'

Docker image authentication

Authentication requirements depend on your Docker image type. You can also use secrets to store authentication usernames and passwords. See Use secrets for authentication.

For public Docker images, you do not need to include authentication information. In the UI, set Authentication to Default. For the API call, do not include the basic_auth fields.
For private Docker images, authenticate using a service principal ID and password (or applicable secrets) as the username and password.
For Azure Container Registry, authenticate using a service principal ID and password (or applicable secrets) as the username and password. See Azure Container Registry service principal authentication documentation for information about creating the service principal.

For Amazon ECR images, do not include authentication information. Instead, launch your compute with an instance profile that includes permissions to pull Docker images from the Docker repository where the image resides. To do this, follow steps 3 and 4 of the process for setting up secure access to S3 buckets using instance profiles.

Here is an example of an IAM policy that grants permission to pull any image from a repository. The repository is specified by <arn-of-repository>.

JSON
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["ecr:GetAuthorizationToken"],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetRepositoryPolicy",
        "ecr:DescribeRepositories",
        "ecr:ListImages",
        "ecr:DescribeImages",
        "ecr:BatchGetImage"
      ],
      "Resource": ["<arn-of-repository>"]
    }
  ]
}
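To fill in <arn-of-repository>, you can build the ARN from its parts. The sketch below assumes the standard `aws` partition (other partitions, such as AWS GovCloud, use a different prefix), and `ecr_repository_arn` is a hypothetical helper name.

```shell
# Hypothetical helper: build an Amazon ECR repository ARN.
# args: region aws-account-id repository-name
# Assumes the standard "aws" partition.
ecr_repository_arn() {
  printf 'arn:aws:ecr:%s:%s:repository/%s\n' "$1" "$2" "$3"
}

# Example with placeholder values:
ecr_repository_arn us-west-2 123456789012 my-team/dcs-env

# Alternatively, if the AWS CLI is installed and configured, look up the ARN:
# aws ecr describe-repositories --repository-names <repository> \
#   --query 'repositories[0].repositoryArn' --output text
```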

If the Amazon ECR image resides in a different AWS account than the Databricks compute, use an ECR repository policy in addition to the compute instance profile to grant the compute access. Here is an example of an ECR repository policy. The IAM role assumed by the compute's instance profile is specified by <arn-of-IAM-role>.

JSON
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCrossAccountPull",
      "Effect": "Allow",
      "Principal": {
        "AWS": "<arn-of-IAM-role>"
      },
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchGetImage",
        "ecr:DescribeImages",
        "ecr:DescribeRepositories",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetRepositoryPolicy",
        "ecr:ListImages"
      ]
    }
  ]
}

Use secrets for authentication

Databricks Container Services supports using secrets for authentication. When creating your compute resource in the UI, use the Authentication field to select Username and password. Then, instead of entering your username or password in plain text, enter your secrets using the {{secrets/<scope-name>/<dcs-secret>}} format. If you use the API, enter the secrets in the basic_auth fields.

For information on creating secrets, see Secret management.
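For the API case, the docker_image object might look like the output of the sketch below. The helper name and the scope and key names in the example are hypothetical. Only the {{secrets/<scope-name>/<dcs-secret>}} reference format and the basic_auth field names come from this page.

```shell
# Hypothetical helper: print a docker_image object whose basic_auth fields
# reference secrets instead of plain-text credentials.
# args: image-url secret-scope username-key password-key
docker_image_with_secrets() {
  cat <<EOF
{
  "url": "$1",
  "basic_auth": {
    "username": "{{secrets/$2/$3}}",
    "password": "{{secrets/$2/$4}}"
  }
}
EOF
}

# Example with placeholder scope and key names:
docker_image_with_secrets "<docker-registry-image-url>" dcs registry-user registry-password
```

You can paste the printed object as the value of docker_image in the --json payload of the basic authentication example.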

Use Databricks Container Services with an instance pool

To use Databricks Container Services with an instance pool, you must create the pool using the Instance Pools API, not the UI.

The pool must be created with Docker images preloaded. This warms idle instances with your custom image so workloads start faster. Set the preloaded_docker_images field on the request with the same image references and authentication you use when launching compute directly. The field is a list, so a single pool can preload multiple images.

The pool and its attached compute resources must agree on whether Docker is in use. If a pool does not have preloaded_docker_images set, you cannot launch Databricks Container Services compute against it. Instead, create a new pool with preloaded_docker_images set.

For pools created with preloaded_docker_images, any compute resource launched against the pool must supply a matching docker_image in its create request. Otherwise, compute creation fails with 'docker_image' must be provided for cluster created with instance pool: <pool-id>.
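As a rough sketch, a request body for creating such a pool could be generated as follows. The helper name is hypothetical, the body shows only a minimal set of fields, and your workspace might need others (for example, authentication for a private image, or cloud-specific attributes). The commented curl call assumes that DATABRICKS_HOST and DATABRICKS_TOKEN are set in your environment.

```shell
# Hypothetical helper: print a minimal Instance Pools API create request
# that preloads one Docker image.
# args: pool-name node-type-id image-url
instance_pool_request() {
  cat <<EOF
{
  "instance_pool_name": "$1",
  "node_type_id": "$2",
  "preloaded_docker_images": [
    { "url": "$3" }
  ]
}
EOF
}

# Example: print the body with placeholder values.
instance_pool_request my-dcs-pool "<node-type>" "<docker-registry-image-url>"

# Sketch of sending the request (assumes DATABRICKS_HOST and DATABRICKS_TOKEN):
# instance_pool_request my-dcs-pool "<node-type>" "<docker-registry-image-url>" |
#   curl -sS -X POST "$DATABRICKS_HOST/api/2.0/instance-pools/create" \
#     -H "Authorization: Bearer $DATABRICKS_TOKEN" \
#     -H "Content-Type: application/json" --data @-
```

When you launch compute against the pool, use the same image URL in the docker_image field of the create request.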

Migrate from the original Databricks Container Services

Databricks Container Services for standard compute is a different service from the original Databricks Container Services for dedicated compute. This feature has the following differences:

Workloads execute through the Spark Connect protocol.
Init scripts do not modify your workload's Python environment. You must install all Python dependencies in the Docker image. You can continue using init scripts for applications that consume data from Spark, such as Datadog or Kafka agents.

AWS Graviton instance types are supported.

To migrate from the original Databricks Container Services for dedicated compute, rebuild your custom image on the base image for Databricks Container Services for standard compute and update your compute configuration:

Replace the FROM line in your Dockerfile with FROM databricksruntime/environment:v5-standard (or v5-standard-arm for AWS Graviton).
Port your Dockerfile instructions to the new base image. Standard Dockerfile instructions are supported, with the exceptions listed in Step 2: Build your custom image.
Install Python packages into /databricks/python3 instead of any other virtualenv. Workloads (notebooks, Python wheel jobs, Python script jobs) read from this path.
Update your compute configuration to use Standard access mode and Databricks Runtime 18.3 or above.
Move any Python environment setup that an init script previously performed into the Dockerfile.
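The first migration step can be sketched as a small script. `rebase_dockerfile` is a hypothetical helper that assumes a single-stage Dockerfile: it replaces only the first FROM line and drops any `AS <name>` suffix on it. The remaining steps, such as porting instructions and moving init-script setup into the Dockerfile, still need manual review.

```shell
# Hypothetical helper: print a Dockerfile with its first FROM line replaced
# by the standard-compute base image. The input file is not modified.
# args: dockerfile [new-base-image]
rebase_dockerfile() {
  awk -v img="${2:-databricksruntime/environment:v5-standard}" '
    !done && toupper($1) == "FROM" { print "FROM " img; done = 1; next }
    { print }
  ' "$1"
}

# Example: write a rebased copy for AWS Graviton instance types.
# rebase_dockerfile Dockerfile databricksruntime/environment:v5-standard-arm \
#   > Dockerfile.standard
```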

Limitations

In addition to the standard compute limitations, Databricks Container Services for standard compute has the following limitations:

Compute-scoped libraries are not supported.
Private package repositories are not supported.
Databricks Runtime for Machine Learning is not supported.
To launch standard compute with Databricks Container Services against an instance pool, the pool must be created with preloaded_docker_images set. See Use Databricks Container Services with an instance pool.

Troubleshooting

If the Docker tab does not appear under Advanced when you create compute, Databricks Container Services is not enabled for your workspace. A workspace admin must enable it in the workspace before any user can specify a Docker image. See Step 1: Enable Databricks Container Services for standard compute.
