Custom Docker image requires root

Custom Docker containers must be configured to start as the root user when used with Databricks.

Written by dayanand.devarapalli

Last published at: March 4th, 2022

Problem

You are trying to launch a Databricks cluster with a custom Docker container, but cluster creation fails with a CONTAINER_LAUNCH_FAILURE error similar to the following.

{
  "reason": {
    "code": "CONTAINER_LAUNCH_FAILURE",
    "type": "SERVICE_FAULT",
    "parameters": {
      "instance_id": "i-xxxxxxx",
      "databricks_error_message": "Failed to launch spark container on instance i-xxxx. Exception: Could not add container for xxxx with address xxxx. Could not mkdir in container"
    }
  }
}

Cause

Databricks clusters require a root user and sudo.

Custom container images that are configured to start as a non-root user are not supported.

For more information, review the custom container documentation.

Solution

You must configure your Docker container to start as the root user.

Example

This container configuration ends with a USER instruction that switches to the non-root user ubuntu, so the container starts as that user. It fails to launch.

FROM databricksruntime/standard:8.x
RUN apt-get update -y && apt-get install -y git && \
    ln -s /databricks/conda/envs/dcs-minimal/bin/pip /usr/local/bin/pip && \
    ln -s /databricks/conda/envs/dcs-minimal/bin/python /usr/local/bin/python
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt .
RUN chown -R ubuntu /app
USER ubuntu

This container configuration has no USER instruction, so the container starts as the root user. It launches successfully.

FROM databricksruntime/standard:8.x
RUN apt-get update -y && apt-get install -y git && \
    ln -s /databricks/conda/envs/dcs-minimal/bin/pip /usr/local/bin/pip && \
    ln -s /databricks/conda/envs/dcs-minimal/bin/python /usr/local/bin/python
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt .
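
Checking a Dockerfile before building

The difference between the two examples is the final USER instruction. As a rough pre-build check, the following Python sketch scans Dockerfile text and reports whether the final build stage ends with a non-root USER. The function names final_user and sets_non_root_user are hypothetical helpers written for this illustration and are not part of any Databricks or Docker tooling. The sketch only reads the Dockerfile itself. It cannot see which user the base image starts as, and it does not expand build arguments or variables.

```python
from typing import Optional

# Values that Docker treats as the root user in a USER instruction.
ROOT_NAMES = {"root", "0"}


def final_user(dockerfile_text: str) -> Optional[str]:
    """Return the argument of the last USER instruction in the final
    build stage, or None if that stage has no USER instruction."""
    user = None
    continued = False  # True while inside a backslash-continued instruction
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments do not end a continuation
        is_continuation = continued
        continued = line.endswith("\\")
        if is_continuation:
            continue  # this line belongs to the previous instruction
        parts = line.split(None, 1)
        instruction = parts[0].upper()
        if instruction == "FROM":
            # A new stage starts with whatever user its base image defines.
            user = None
        elif instruction == "USER" and len(parts) == 2:
            user = parts[1].strip()
    return user


def sets_non_root_user(dockerfile_text: str) -> bool:
    """Return True if the final stage explicitly switches to a non-root user."""
    user = final_user(dockerfile_text)
    if user is None:
        return False  # no explicit USER; the base image decides
    name = user.split(":", 1)[0]  # drop an optional ":group" suffix
    return name not in ROOT_NAMES
```

For the failing example above, sets_non_root_user returns True because the last instruction is USER ubuntu. For the working example it returns False. A False result does not guarantee that the container starts as root, because the base image may define its own user.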