Customize containers with Databricks Container Services

Databricks Container Services lets you specify a Docker image when you create a cluster. Some example use cases include:

  • Library customization - you have full control over the system libraries you want installed.
  • Golden container environment - your Docker image is a locked down environment that will never change.
  • Docker CI/CD integration - you can integrate Databricks with your Docker CI/CD pipelines.

You can also use Docker images to create custom deep learning environments on clusters with GPU devices. Refer to the GPU user guide for additional information about using GPU clusters with Databricks Container Services.


Requirements

  • Databricks Runtime 6.1 or above. If you have previously used Databricks Container Services, you must upgrade your base images; refer to the latest images, which are tagged with 6.x. Databricks Runtime for Machine Learning and Databricks Runtime for Genomics do not support Databricks Container Services.
  • Your Databricks workspace must have Databricks Container Services enabled.
  • Your machine must be running a recent Docker daemon (one that is tested and works with Client/Server version 18.03.0-ce), and the docker command must be available on your PATH.

Step 1: Build your base

There are several minimal requirements for Databricks to launch a cluster successfully. Because of this, we recommend that you build your Docker image from a base image that Databricks has built and tested:

FROM databricksruntime/standard:latest

To specify additional Python libraries, such as the latest versions of pandas and urllib3, use the container-specific version of pip. For the databricksruntime/standard:latest container, include the following:

RUN /databricks/conda/envs/dcs-minimal/bin/pip install pandas
RUN /databricks/conda/envs/dcs-minimal/bin/pip install urllib3

Example base images are hosted on Docker Hub under the databricksruntime organization. The Dockerfiles used to generate these bases are available in the Databricks containers repository on GitHub.


The base images databricksruntime/standard and databricksruntime/minimal are not to be confused with the unrelated databricks-standard and databricks-minimal environments included in the discontinued Databricks Runtime with Conda (Beta).

You can also build your Docker base from scratch. Your Docker image must meet these requirements:

Or, you can use the minimal image built by Databricks at databricksruntime/minimal.

The minimal requirements listed above do not include Python, R, Ganglia, or many other features that you typically expect in Databricks clusters. To get these features, build from the appropriate base image (for example, databricksruntime/rbase for R), or reference the Dockerfiles in GitHub to determine how to build in support for the specific features you want.


You now have control over the cluster’s environment, but with great flexibility comes ease of breakage. This document provides several recommendations based on our experience. Eventually you will step out of known territory, and things may break. As with any Docker workflow, an image may not work on the first attempt or the second, but once it builds and launches correctly it tends to keep working.

Step 2: Push your base image

Push your custom base image to a Docker registry. This process has been tested with Docker Hub, Amazon ECR, and Azure Container Registry (ACR). Docker registries that support no auth or basic auth are expected to work.
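The tag-and-push step can be sketched as a small helper that assembles the docker CLI invocations. The image, registry, and repository names here are hypothetical placeholders; only the docker tag and docker push commands themselves are real:

```python
# Sketch: assemble the docker CLI commands needed to push a custom base
# image to a registry. Names are illustrative, not prescribed by Databricks.

def push_commands(local_image: str, registry: str, repository: str, tag: str = "latest"):
    """Return the docker invocations to retag an image and push it."""
    remote = f"{registry}/{repository}:{tag}"
    return [
        ["docker", "tag", local_image, remote],  # retag for the target registry
        ["docker", "push", remote],              # upload to the registry
    ]

# Example: push a locally built image to a hypothetical private registry.
for cmd in push_commands("my-custom-base:latest", "registry.example.com", "my-repo"):
    print(" ".join(cmd))
```

You would run the returned commands (or the equivalent shell commands directly) after authenticating to the registry, for example with docker login.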

Step 3: Launch your cluster

You can launch your cluster using the UI or the API.

Launch your cluster using the UI

  1. Specify a Databricks Runtime Version that supports Databricks Container Services.

  2. Select Use your own Docker container.

  3. In the Docker Image URL field, enter your custom Docker image.

    Docker image URL examples:

    • Docker Hub: <organization>/<repository>:<tag>, for example: databricksruntime/standard:latest
    • Azure Container Registry: <your-registry-name>.azurecr.io/<repository-name>:<tag>
    • Amazon Elastic Container Registry: <aws-account-id>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>
  4. Select the authentication type.

Launch your cluster using the API

  1. Generate an API token.

  2. Use the Clusters API to launch a cluster with your custom Docker base.

    curl -X POST -H "Authorization: Bearer <token>" https://<databricks-instance>/api/2.0/clusters/create -d '{
      "cluster_name": "<cluster-name>",
      "num_workers": 0,
      "node_type_id": "i3.xlarge",
      "docker_image": {
        "url": "databricksruntime/standard:latest",
        "basic_auth": {
          "username": "<docker-registry-username>",
          "password": "<docker-registry-password>"
        }
      },
      "spark_version": "6.1.x-scala2.11",
      "aws_attributes": {
        "availability": "ON_DEMAND",
        "instance_profile_arn": "arn:aws:iam::<aws-account-number>:instance-profile/<iam-role-name>"
      }
    }'

    basic_auth requirements depend on your Docker image type:

    • For public Docker images, do not include the basic_auth field.

    • For private Docker images, you must include the basic_auth field, using your Docker registry username and password.

    • For Azure ACR, you must include the basic_auth field, using a service principal ID and password as the username and password. See Azure ACR service principal authentication documentation for information about creating the service principal.

    • For Amazon ECR images, do not include the basic_auth field. You must launch your cluster with an instance profile that includes permissions to pull Docker images from the Docker repository where the image resides. To do this, follow steps 3 through 5 of the process for setting up secure access to S3 buckets using instance profiles.

      Here is an example of an IAM policy that grants permission to pull images. The repository is specified by <arn-of-repository>.

         {
           "Version": "2012-10-17",
           "Statement": [
             {
               "Effect": "Allow",
               "Action": [
                 "ecr:GetAuthorizationToken"
               ],
               "Resource": "*"
             },
             {
               "Effect": "Allow",
               "Action": [
                 "ecr:BatchCheckLayerAvailability",
                 "ecr:GetDownloadUrlForLayer",
                 "ecr:GetRepositoryPolicy",
                 "ecr:DescribeRepositories",
                 "ecr:ListImages",
                 "ecr:BatchGetImage"
               ],
               "Resource": [ "<arn-of-repository>" ]
             }
           ]
         }
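The basic_auth rules above can be sketched as a small helper that assembles the docker_image field of a clusters/create request. The helper and its registry-type labels are illustrative assumptions; only the shape of the returned JSON follows the Clusters API:

```python
# Sketch: build the "docker_image" field of a clusters/create request body.
# The registry_type values ("public", "ecr", "acr", "private") and this
# helper are illustrative; the Clusters API only sees the resulting JSON.

def docker_image_field(url, registry_type, username=None, password=None):
    """Return a docker_image dict following the basic_auth rules above."""
    image = {"url": url}
    if registry_type in ("public", "ecr"):
        # Public images and Amazon ECR: omit basic_auth entirely.
        # (For ECR, authentication comes from the instance profile instead.)
        return image
    # Private registries use registry credentials; Azure ACR uses a
    # service principal ID and password.
    image["basic_auth"] = {"username": username, "password": password}
    return image

print(docker_image_field("databricksruntime/standard:latest", "public"))
print(docker_image_field(
    "<your-registry-name>.azurecr.io/<repository-name>:latest", "acr",
    username="<service-principal-id>", password="<service-principal-password>",
))
```

The returned dict would be placed under the "docker_image" key of the request body shown in step 2.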