Problem
In a notebook attached to a standard (formerly shared) access mode cluster running a non-ML Databricks Runtime, you try to use VectorAssembler to combine multiple feature columns into a single feature vector. The call fails with the following error.
Py4JError: An error occurred while calling None.org.apache.spark.ml.feature.VectorAssembler. Trace:
py4j.security.Py4JSecurityException: Constructor public org.apache.spark.ml.feature.VectorAssembler(java.lang.String) is not allowlisted.
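For reference, a call along the following lines is enough to trigger the error. This is a minimal sketch, not a Databricks-provided utility: `is_allowlist_error` and `build_assembler` are hypothetical helper names, and the check assumes the exception text contains the same message shown in the trace above.

```python
def is_allowlist_error(exc: Exception) -> bool:
    """Return True if the exception text matches the allowlist rejection shown above."""
    text = str(exc)
    return "Py4JSecurityException" in text and "is not allowlisted" in text


def build_assembler(input_cols, output_col="features"):
    """Construct a VectorAssembler, printing a hint if the cluster rejects it."""
    # Imported inside the function so is_allowlist_error can be used on its own.
    from pyspark.ml.feature import VectorAssembler

    try:
        # Per the trace, the constructor call itself is what gets rejected.
        return VectorAssembler(inputCols=input_cols, outputCol=output_col)
    except Exception as exc:
        if is_allowlist_error(exc):
            print("VectorAssembler is blocked by this cluster's access mode.")
        raise
```

In a notebook, calling `build_assembler(["feature_a", "feature_b"])` (the column names are placeholders) on a standard access mode cluster should print the hint and then re-raise the original error.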
Cause
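Apache Spark Machine Learning Library (MLlib) is not supported on Unity Catalog-enabled clusters with standard access mode. As the trace shows, the VectorAssembler constructor is not on the Py4J allowlist that these clusters enforce, so the call is rejected before the object is created.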
Solution
Use a Dedicated (formerly single user) access mode cluster assigned to a group of users. This feature is currently in public preview, so a workspace admin must first enable it at the workspace level.
- Click the user icon at the top right of your workspace page, then click Previews.
- Turn on the Compute: Dedicated group clusters preview toggle.
Once the preview is enabled, you can create a dedicated cluster for a group in your workspace. Follow the steps in the Assign compute resources to a group (AWS | Azure | GCP) documentation.
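After attaching the notebook to the dedicated cluster, a quick check like the one below can confirm that VectorAssembler now works. It is only a sketch: `assemble_features` and `run_check` are hypothetical helper names, and the two-row DataFrame and its column names are made up for illustration.

```python
def assemble_features(df, input_cols, output_col="features"):
    """Return df with input_cols combined into a single vector column."""
    # Imported inside the function so defining it never touches pyspark.ml.
    from pyspark.ml.feature import VectorAssembler

    assembler = VectorAssembler(inputCols=input_cols, outputCol=output_col)
    return assembler.transform(df)


def run_check(spark):
    """Build a tiny DataFrame and display the assembled feature vector."""
    # Hypothetical sample data; replace with your own table and columns.
    df = spark.createDataFrame(
        [(1.0, 2.0), (3.0, 4.0)],
        ["feature_a", "feature_b"],
    )
    assemble_features(df, ["feature_a", "feature_b"]).show(truncate=False)
```

In a Databricks notebook the `spark` session is already defined, so `run_check(spark)` is all that is needed. If the allowlist error still appears, the notebook is most likely still attached to a standard access mode cluster.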