Change the minor version of Python in a cluster

Use an init script to install the desired versions of Python and pyenv.

Written by Adam Pavlacka

Last published at: March 10th, 2025

Problem

You want to change the minor version of Python that is included with the version of Databricks Runtime you have selected.

Info

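This method only allows you to update the minor version of Python. In this article, "minor version" means the last component of the version number (the final 11 in 3.11.11), which Python itself calls the micro version. You cannot change the first two components (3.11).

For example, you can update Python 3.11.0 to Python 3.11.11. 

You cannot update Python 3.11.x to Python 3.12.x.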
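The rule above can be checked mechanically before you choose a target version. The following is a minimal sketch, not part of the Databricks init script; `same_python_series` is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original init script): succeeds only
# when two Python versions share the same X.Y series, e.g. 3.11.0 and 3.11.11.
same_python_series() {
  local current="$1" target="$2"
  local current_series target_series
  # Keep only the first two dot-separated fields (X.Y) of each version.
  current_series="$(printf '%s' "$current" | cut -d. -f1,2)"
  target_series="$(printf '%s' "$target" | cut -d. -f1,2)"
  [ "$current_series" = "$target_series" ]
}

if same_python_series "3.11.0" "3.11.11"; then
  echo "3.11.0 -> 3.11.11: supported by this method"
fi
if ! same_python_series "3.11.0" "3.12.0"; then
  echo "3.11.0 -> 3.12.0: not supported by this method"
fi
```

Comparing only the first two fields keeps the check independent of how many releases exist within a series.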

 

 

Cause

Every version of Databricks Runtime ships with a specific version of Python. You can find the included Python version in the Databricks Runtime release notes versions and compatibility (AWS | Azure | GCP) documentation for your selected Databricks Runtime.

Open the release notes for your selected Databricks Runtime and review the System environment section.
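If you also want to confirm the version on a running cluster, you can ask the interpreter directly. This is a sketch under assumptions: the interpreter name `python3` and the `PYTHON_BIN` variable are placeholders chosen here, and `parse_python_version` is a hypothetical helper. Adjust the interpreter path to match your cluster.

```shell
#!/bin/bash
# Hypothetical helper: strip the "Python " prefix from `python --version` output.
parse_python_version() {
  printf '%s\n' "${1#Python }"
}

# Interpreter to inspect; "python3" is an assumption, adjust for your cluster.
PYTHON_BIN="${PYTHON_BIN:-python3}"

if command -v "$PYTHON_BIN" >/dev/null 2>&1; then
  echo "Current Python: $(parse_python_version "$("$PYTHON_BIN" --version 2>&1)")"
else
  echo "Interpreter not found: $PYTHON_BIN" >&2
fi
```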

You may have a specific situation where you want to update the Python version, but keep your selected Databricks Runtime.

 

Solution

You can use a cluster-scoped init script (AWS | Azure | GCP) to install an updated version of Python on your cluster when it starts.

This example init script uses the “deadsnakes” repository to install Python and the pyenv GitHub repository to install a matching version of pyenv.

Info

You can use any Python repository (including an internal one) to install the Python binaries. The “deadsnakes” repository used here is one of many available Python sources.

 

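Before you run the init script, replace <python-version-to-install> with the Python package to install (for example, python3.11) and <pyenv-version-to-install> with the pyenv release tag to download (for example, v2.5.0).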
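One way to avoid leaving a placeholder unreplaced is to keep both values in variables and validate them before use. The sketch below is illustrative only: the variable names `PYTHON_PACKAGE` and `PYENV_TAG` and the three helper functions are hypothetical, and the patterns assume package names in the style of python3.11 and tags in the style of v2.5.0.

```shell
#!/bin/bash
# Hypothetical variables standing in for the two placeholders.
PYTHON_PACKAGE="python3.11"   # replaces <python-version-to-install>
PYENV_TAG="v2.5.0"            # replaces <pyenv-version-to-install>

# Succeeds only for package names shaped like python3.11 (assumed pattern).
is_valid_python_package() {
  printf '%s\n' "$1" | grep -Eq '^python3\.[0-9]+$'
}

# Succeeds only for release tags shaped like v2.5.0 (assumed pattern).
is_valid_pyenv_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

# Build the tarball URL that the init script downloads for a given tag.
pyenv_tarball_url() {
  printf 'https://github.com/pyenv/pyenv/archive/refs/tags/%s.tar.gz\n' "$1"
}

if is_valid_python_package "$PYTHON_PACKAGE" && is_valid_pyenv_tag "$PYENV_TAG"; then
  echo "apt package: $PYTHON_PACKAGE"
  echo "pyenv URL:   $(pyenv_tarball_url "$PYENV_TAG")"
else
  echo "Unexpected value for PYTHON_PACKAGE or PYENV_TAG" >&2
fi
```

A literal placeholder such as <python-version-to-install> fails both checks, so the mistake surfaces before apt or wget runs.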

#!/bin/bash

# Add the deadsnakes PPA and refresh the package index.
add-apt-repository -y ppa:deadsnakes/ppa
apt-get update --allow-releaseinfo-change-origin
# Install the requested Python package non-interactively, keeping existing config files.
DEBIAN_FRONTEND=noninteractive apt -y -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" install <python-version-to-install>
# Repair any broken dependencies left by the install.
DEBIAN_FRONTEND=noninteractive apt -y -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" --fix-broken install

# Download the requested pyenv release and unpack it into /databricks/.pyenv.
wget https://github.com/pyenv/pyenv/archive/refs/tags/<pyenv-version-to-install>.tar.gz -O pyenv.tar.gz \
&& tar -xvf pyenv.tar.gz --strip-components 1 -C /databricks/.pyenv \
&& rm pyenv.tar.gz

 

Important

When using standard (formerly shared) access mode clusters, you must add _PIP_USE_IMPORTLIB_METADATA=false to the cluster's Spark config. This is required for library installations to work.

This init script also does not change the UDF Python version on standard access mode clusters, as init scripts are not applied on UDF workers.

 

 

Example - Install Python 3.11.11 and pyenv 2.5.0

This example fills in the placeholders in the script above to install the latest Python 3.11.x release available from the deadsnakes repository and pyenv 2.5.0.

#!/bin/bash

# install the latest python 3.11 version
add-apt-repository -y ppa:deadsnakes/ppa
apt-get update --allow-releaseinfo-change-origin
DEBIAN_FRONTEND=noninteractive apt -y -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" install python3.11
DEBIAN_FRONTEND=noninteractive apt -y -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" --fix-broken install

# install pyenv 2.5.0 that supports python 3.11.11
wget https://github.com/pyenv/pyenv/archive/refs/tags/v2.5.0.tar.gz -O pyenv.tar.gz \
&& tar -xvf pyenv.tar.gz --strip-components 1 -C /databricks/.pyenv \
&& rm pyenv.tar.gz
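The extraction step in the script assumes the target directory already exists. If you want the step to fail loudly instead of silently, a wrapper along these lines may help. This is a sketch, not the documented method: `extract_pyenv` is a hypothetical helper, and creating the directory with `mkdir -p` is an added safeguard that the original script does not perform.

```shell
#!/bin/bash
# Hypothetical helper: unpack a pyenv release tarball into a target directory,
# creating the directory first and reporting an error if extraction fails.
extract_pyenv() {
  local tarball="$1" dest="$2"
  mkdir -p "$dest" || return 1
  # --strip-components 1 drops the top-level pyenv-<version>/ folder.
  tar -xf "$tarball" --strip-components 1 -C "$dest" || {
    echo "Failed to extract $tarball into $dest" >&2
    return 1
  }
}

# In the init script this would be called as:
#   extract_pyenv pyenv.tar.gz /databricks/.pyenv
```

Returning a non-zero status lets the calling script decide whether to stop, which makes a failed download or a corrupt archive easier to spot in the init script logs.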