Error when importing OneHotEncoderEstimator

You get an error message when trying to import OneHotEncoderEstimator.

Written by Shyamprasad Miryala

Last published at: May 16th, 2022

Problem

You have migrated a notebook from Databricks Runtime 6.4 for Machine Learning or below to Databricks Runtime 7.3 for Machine Learning or above.

When you import OneHotEncoderEstimator, you get the following import error:

ImportError: cannot import name 'OneHotEncoderEstimator' from 'pyspark.ml.feature' (/databricks/spark/python/pyspark/ml/feature.py)

Cause

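OneHotEncoderEstimator was renamed to OneHotEncoder in Apache Spark 3.0. Databricks Runtime 6.4 is built on Spark 2.4, while Databricks Runtime 7.x and above are built on Spark 3.x, so the old class name no longer exists in pyspark.ml.feature after the migration.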

Solution

You must replace OneHotEncoderEstimator references in your notebook with OneHotEncoder.
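If the notebook has been exported as a source file, the replacement can be scripted rather than done by hand. The following is a minimal sketch, not an official migration tool: the function name rename_encoder is hypothetical, and it assumes the notebook source is available as a plain string.

```python
import re

# Match the old class name only as a whole identifier, so that longer
# names that merely contain it are left untouched.
_OLD_NAME = re.compile(r"\bOneHotEncoderEstimator\b")


def rename_encoder(source: str) -> str:
    """Return notebook source with OneHotEncoderEstimator renamed to OneHotEncoder."""
    return _OLD_NAME.sub("OneHotEncoder", source)
```

Review the result before running it. If the original notebook also used the older Spark 2.x OneHotEncoder class, both references now point to the same name and need to be checked by hand.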

The following import statement raises an import error in Databricks Runtime 7.3 for Machine Learning or above:

%python

from pyspark.ml.feature import OneHotEncoderEstimator

The following import statement works correctly in Databricks Runtime 7.3 for Machine Learning or above:

%python

from pyspark.ml.feature import OneHotEncoder
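
If the same notebook has to run on both the old and the new runtime during a migration, one option is to resolve the class by whichever name is present. This is a sketch under assumptions, not part of the documented solution: the helper load_one_hot_encoder is hypothetical. It checks for the old name first because, on Spark 2.3 and 2.4, the name OneHotEncoder refers to an older, deprecated class with a different interface.

```python
import importlib


def load_one_hot_encoder(module_name="pyspark.ml.feature"):
    """Return the estimator-style one-hot encoder class under either name."""
    module = importlib.import_module(module_name)
    # Spark 2.3/2.4: the estimator is called OneHotEncoderEstimator.
    if hasattr(module, "OneHotEncoderEstimator"):
        return module.OneHotEncoderEstimator
    # Spark 3.0 and above: the same estimator is called OneHotEncoder.
    return module.OneHotEncoder


# In a notebook cell (requires pyspark to be installed):
# OneHotEncoder = load_one_hot_encoder()
```

Once every cluster runs Databricks Runtime 7.3 for Machine Learning or above, the plain import of OneHotEncoder is simpler and the helper can be removed.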