Python 2 sunset status

Learn about the sunset status of Python 2 in Databricks.

Written by Adam Pavlacka

Last published at: May 19th, 2022

Python.org officially moved Python 2 into EoL (end-of-life) status on January 1, 2020.

What does this mean for you?

Databricks Runtime 6.0 and above

Databricks Runtime 6.0 and above support only Python 3. You cannot create a cluster with Python 2 using these runtimes. Any clusters created with these runtimes use Python 3 by definition.

Databricks Runtime 5.5 LTS

When you create a Databricks Runtime 5.5 LTS cluster in the workspace UI, the default is Python 3, and you have the option to select Python 2 instead. When you create a Databricks Runtime 5.5 LTS cluster with the Databricks REST API (AWS | Azure), the default is Python 2. If you already have a Databricks Runtime 5.5 LTS cluster running Python 2, you are not required to upgrade to Python 3.

To specify Python 3 when you create a cluster using the Databricks REST API, include the following setting in the body of the cluster creation request.

"spark_env_vars": {
  "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
},
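
As a rough illustration of where this setting sits, the following Python sketch builds a cluster creation request body that includes it. Only the spark_env_vars entry comes from this article. The helper name build_cluster_spec is hypothetical, the other field names (cluster_name, spark_version, node_type_id, num_workers) are assumed from the Clusters API, and every value is a placeholder to replace with your own. The sketch only prints the JSON and does not send a request.

```python
import json

# Interpreter path taken from the setting shown in this article.
PYTHON3_PATH = "/databricks/python3/bin/python3"


def build_cluster_spec(cluster_name, spark_version, node_type_id, num_workers=1):
    """Return a request body for creating a cluster that runs Python 3.

    Hypothetical helper: every field except spark_env_vars is an assumption
    about the Clusters API, and all values are supplied by the caller.
    """
    return {
        "cluster_name": cluster_name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        # Without this entry, a Databricks Runtime 5.5 LTS cluster created
        # through the REST API defaults to Python 2.
        "spark_env_vars": {"PYSPARK_PYTHON": PYTHON3_PATH},
    }


# Placeholder values; check the runtime version string and node type
# available in your own workspace before using them.
spec = build_cluster_spec(
    cluster_name="py3-cluster",
    spark_version="5.5.x-scala2.11",
    node_type_id="<your-node-type-id>",
)
print(json.dumps(spec, indent=2))
```

The printed JSON can then be submitted as the request body with whichever HTTP client and authentication method you already use for the Databricks REST API.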

Should I upgrade to Python 3?

The decision to upgrade depends on your specific circumstances, including your reliance on other systems and dependencies. Make this decision together with your engineering organization.

The official Python.org statement is as follows:

As of January 1st, 2020 no new bug reports, fixes, or changes will be made to Python 2, and Python 2 is no longer supported. We have not yet released the few changes made between when we released Python 2.7.17 (on October 19th, 2019) and January 1st. As a service to the community, we will bundle those fixes (and only those fixes) and release a 2.7.18. We plan on doing that in April 2020, because that’s convenient for the release managers, not because it implies anything about when support ends.

Support

Databricks does not offer official support for discontinued third-party software.

Support requests related to Python 2 are not eligible for engineering support.