Problem
When attempting to install libraries using %pip or %conda commands on a cluster with SSL encryption enabled, you receive the following error.
Py4JJavaError: An error occurred while calling t.getNotebookScopedPythonEnvManager. : org.apache.spark.SparkException: %pip/%conda commands use unencrypted NFS and are disabled by default when SSL encryption is enabled. NFS can be safely used to install libraries that do not contain PHI or other sensitive data, such as open source packages. %pip/%conda commands or NFS should not be used to transmit PHI to Spark workers. To enable %pip/%conda commands, set spark.databricks.libraries.ignoreSSL to true in Spark config in cluster settings and restart your cluster.
Cause
When SSL encryption is enabled on a cluster, %pip and %conda commands are disabled by default because they use unencrypted NFS. This is a security measure to prevent the transmission of sensitive data over unencrypted channels.
Solution
- In your cluster, click Advanced Options.
- Navigate to the Spark tab.
- In the Spark config box, enter the following line:
  spark.databricks.libraries.ignoreSSL true
- Restart your cluster to apply the new configuration.
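Before restarting, it can help to confirm that the Spark config text contains the setting in the expected form of one key and value per line, separated by whitespace. The following is a minimal sketch of such a check. The helper names parse_spark_config and pip_conda_enabled are hypothetical and are not part of any Databricks API; the sketch only inspects text you paste into it and does not query the cluster.

```python
def parse_spark_config(text: str) -> dict:
    """Parse Spark config box text (one 'key value' pair per line) into a dict.

    Hypothetical helper for illustration; assumes keys and values are
    separated by whitespace, as in the Solution steps.
    """
    conf = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)  # split on the first run of whitespace
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        conf[key] = value
    return conf


def pip_conda_enabled(text: str) -> bool:
    """Return True if the config text sets spark.databricks.libraries.ignoreSSL to true."""
    conf = parse_spark_config(text)
    return conf.get("spark.databricks.libraries.ignoreSSL", "false").lower() == "true"


# Example: text as it might be entered in the Spark config box.
sample = "spark.databricks.libraries.ignoreSSL true"
print(pip_conda_enabled(sample))
```

If the check returns False, look for a typo in the key or a missing value on that line before restarting the cluster.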
Important
You can safely set the spark.databricks.libraries.ignoreSSL configuration to true when installing open source packages, as long as the packages don’t contain protected health information (PHI) or other sensitive data.
If you have further security concerns, consult with your internal security team for guidance.
For more information, refer to the Encrypt traffic between cluster worker nodes (AWS | Azure | GCP) documentation.