Problem
You are trying to import a module from a wheel file (.whl) installed as a library. When you run the import from the workspace (/Workspace) on a multi-node cluster, the Apache Spark job stalls or fails. You receive the following error.
2025/08/06 09:35:18.473363 [#6664677558339676] workspace.ERROR wsfs/workspace/workspace.go:329 logUnexpectedError:[Driver] CheckRetry(err != nil) statusCode:-1 err:Get "https://10.53.193.60:1017/api/2.0/workspace-files/get-safe-flags": dial tcp 10.53.193.60:1017: i/o timeout
Cause
On executors, /Workspace is not locally mounted. Workspace Filesystem (WSFS) resolves the import through remote API calls to the driver over ports 1017, 1015, and 1021.
When network latency is high or a firewall blocks traffic on ports 1017, 1015, and 1021, these API calls time out and the import process retries multiple times. The retries can cause the job to stall or fail.
On the driver, /Workspace is locally mounted, which is why imports resolve quickly when you run a job on a single-node cluster.
Solution
Allow all traffic on ports 1017, 1015, and 1021.
Check your workspace network, firewall, or security group configuration. Ensure ports 1017, 1015, and 1021 are open between executors and the driver and remove any firewall rules blocking traffic on these ports.
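To confirm whether these ports are reachable before and after changing firewall rules, a plain TCP connection test can help. The following is a minimal sketch, not an official diagnostic tool: the function name check_ports and the constant WSFS_PORTS are hypothetical, and the driver address must be supplied by you (for example, the IP address shown in the error message). It only checks that a TCP connection can be opened. It does not verify that the WSFS API itself responds.

```python
import socket

# Ports named in this article for WSFS traffic between executors and the driver.
WSFS_PORTS = (1017, 1015, 1021)


def check_ports(host, ports=WSFS_PORTS, timeout=3.0):
    """Return {port: bool}, where True means a TCP connection to host:port opened.

    Hypothetical helper for illustration. A False value covers timeouts,
    refused connections, and unreachable hosts alike.
    """
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError (including timeouts) on failure.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results


# Example use from a notebook, assuming `sc` (the SparkContext) is available
# and `driver_host` holds your driver's address (an assumption, set it yourself):
#
# driver_host = "10.53.193.60"  # address taken from the error message above
# report = (
#     sc.parallelize(range(8), 8)
#     .map(lambda _: (socket.gethostname(), check_ports(driver_host)))
#     .collect()
# )
# print(report)
```

Running the check inside a Spark task matters because the connection must be tested from the executors, not from the driver. If any port reports False from an executor, review the firewall or security group rules between that executor and the driver.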