Py4JJavaError when trying to install libraries on SSL-encrypted cluster

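Set the Apache Spark configuration spark.databricks.libraries.ignoreSSL to true to re-enable %pip and %conda commands. This setting does not turn off SSL encryption on the cluster.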

Written by vidya.sagamreddy

Last published at: January 28th, 2025

Problem

When attempting to install libraries using %pip or %conda commands on a cluster with SSL encryption enabled, you receive an error.

 

Py4JJavaError: An error occurred while calling t.getNotebookScopedPythonEnvManager. : org.apache.spark.SparkException: %pip/%conda commands use unencrypted NFS and are disabled by default when SSL encryption is enabled. NFS can be safely used to install libraries that do not contain PHI or other sensitive data, such as open source packages. %pip/%conda commands or NFS should not be used to transmit PHI to Spark workers. To enable %pip/%conda commands, set spark.databricks.libraries.ignoreSSL to true in Spark config in cluster settings and restart your cluster.

 

Cause

When SSL encryption is enabled on a cluster, %pip and %conda commands are disabled by default because they use unencrypted NFS. This is a security measure to prevent the transmission of sensitive data over unencrypted channels. 

 

Solution

  1. Open your cluster's configuration and expand Advanced Options.
  2. Select the Spark tab.
  3. In the Spark config box, enter the following key and value on one line, separated by a space: spark.databricks.libraries.ignoreSSL true
  4. Restart your cluster to apply the new configuration.
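
If you manage cluster definitions as JSON rather than through the UI, the same setting can be added programmatically. The following is a minimal sketch, assuming your cluster definition stores Spark settings in a spark_conf map of string keys to string values. The helper name with_ignore_ssl and the example cluster definition are hypothetical and used only for illustration.

```python
import copy
import json

IGNORE_SSL_KEY = "spark.databricks.libraries.ignoreSSL"


def with_ignore_ssl(cluster_spec: dict) -> dict:
    """Return a copy of cluster_spec with the ignoreSSL Spark config set to "true".

    Assumes Spark settings live under a "spark_conf" key (an assumption
    about how your cluster definition is structured).
    """
    updated = copy.deepcopy(cluster_spec)  # leave the caller's dict untouched
    spark_conf = updated.setdefault("spark_conf", {})
    spark_conf[IGNORE_SSL_KEY] = "true"  # values are kept as strings
    return updated


# Hypothetical existing cluster definition, for illustration only.
example_spec = {
    "cluster_name": "example-ssl-cluster",
    "spark_conf": {"spark.sql.shuffle.partitions": "200"},
}

new_spec = with_ignore_ssl(example_spec)
print(json.dumps(new_spec["spark_conf"], indent=2))
```

The helper returns a modified copy instead of changing the input, so the original definition stays available for comparison or rollback. The cluster still needs a restart before the setting takes effect.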

 

Important

You can safely set the spark.databricks.libraries.ignoreSSL configuration to true when installing open source packages, as long as the packages don’t contain protected health information (PHI) or other sensitive data. 

If you have further security concerns, consult with your internal security team for guidance.

 

 

For more information, refer to the Encrypt traffic between cluster worker nodes (AWS | Azure | GCP) documentation.