Error when importing OneHotEncoderEstimator

Problem

You have migrated a notebook from Databricks Runtime 6.4 for Machine Learning or below to Databricks Runtime 7.3 for Machine Learning or above.

You are attempting to import OneHotEncoderEstimator and the import fails with the following error:

ImportError: cannot import name 'OneHotEncoderEstimator' from 'pyspark.ml.feature' (/databricks/spark/python/pyspark/ml/feature.py)

Cause

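OneHotEncoderEstimator was renamed to OneHotEncoder in Apache Spark 3.0. Databricks Runtime 7.0 and above is built on Apache Spark 3.0 or later, so the old name no longer exists in pyspark.ml.feature.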

Solution

You must replace OneHotEncoderEstimator references in your notebook with OneHotEncoder.

For example, the following sample code returns an import error in Databricks Runtime 7.3 for Machine Learning or above:

from pyspark.ml.feature import OneHotEncoderEstimator

The following sample code functions correctly in Databricks Runtime 7.3 for Machine Learning or above:

from pyspark.ml.feature import OneHotEncoder
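
If the same notebook has to run on both older and newer runtimes during a migration, one possible approach is to look up the class by name instead of hard-coding the import. The following is a minimal sketch, not an official Databricks or Spark utility. The helper name resolve_one_hot_encoder and the column names in the usage comment are hypothetical. The helper checks the old name first because, on Spark 2.3 and 2.4, the name OneHotEncoder also exists but refers to the older, deprecated transformer.

```python
def resolve_one_hot_encoder(feature_module):
    """Return the one-hot encoder estimator class from a pyspark.ml.feature-like module.

    Hypothetical helper for illustration; it is not part of PySpark.
    """
    # Prefer the pre-3.0 name: where it exists, OneHotEncoder is the old,
    # deprecated transformer rather than the estimator we want.
    for name in ("OneHotEncoderEstimator", "OneHotEncoder"):
        encoder_class = getattr(feature_module, name, None)
        if encoder_class is not None:
            return encoder_class
    raise ImportError("No one-hot encoder class found in the given module")


# Usage in a notebook (requires pyspark; column names are placeholders):
#   import pyspark.ml.feature as feature
#   OneHotEncoder = resolve_one_hot_encoder(feature)
#   encoder = OneHotEncoder(inputCols=["category_index"], outputCols=["category_vec"])
```

Once every cluster that runs the notebook is on Databricks Runtime 7.3 for Machine Learning or above, the plain OneHotEncoder import shown above is sufficient and the helper can be removed.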