Problem
When you try to pip install the ArcGIS library on a cluster, the installation fails with the following error.
line 159, in _get_build_requires
self.run_setup()
File "/databricks/python/lib/python3.9/site-packages/setuptools/build_meta.py", line 174, in run_setup
exec(compile(code, __file__, 'exec'), locals())
File "setup.py", line 109, in <module>
link_args = shlex.split(get_output(f"{kc} --libs gssapi"))
File "setup.py", line 22, in get_output
res = subprocess.check_output(*args, shell=True, **kwargs)
File "/usr/lib/python3.9/subprocess.py", line 424, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/usr/lib/python3.9/subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'krb5-config --libs gssapi' returned non-zero exit status 127.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
Cause
The ArcGIS package requires native components that link against Kerberos (GSSAPI). During installation, the ArcGIS package invokes krb5-config, which is missing from the Databricks environment. krb5-config is provided by the libkrb5-dev package, and its absence causes the installation to fail with subprocess-exited-with-error.
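Exit status 127 in the traceback is the code a POSIX shell returns when it cannot find a command. The following is a minimal sketch for confirming this from a notebook cell, assuming the cell runs on a Linux node of the affected cluster. The helper name `probe_command` is hypothetical and is not part of ArcGIS or Databricks.

```python
import shutil
import subprocess


def probe_command(command: str) -> dict:
    """Report where the executable lives on PATH and the shell's exit status."""
    executable = command.split()[0]
    # shutil.which returns None when the executable is not on PATH.
    path = shutil.which(executable)
    # Run through the shell, as the failing setup.py does with shell=True.
    result = subprocess.run(
        command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    return {"path": path, "returncode": result.returncode}


# The same command that appears in the traceback.
print(probe_command("krb5-config --libs gssapi"))
```

A `path` of `None` together with a `returncode` of 127 matches the failure shown in the Problem section. Once libkrb5-dev is installed on the cluster, the same call should report a path and a return code of 0.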
Solution
From the UI, create a cluster-scoped init script to install the required libkrb5-dev binary package. The script is stored in your workspace filesystem, though you may optionally store it in S3 or a Unity Catalog volume.
- From your workspace landing page, navigate to your home folder.
- Click Create in the top-right corner of the page. Then select File.
- Add the following script to the editor and save the file as arcgis_requirements.sh.
#!/bin/bash
# Install the system packages that ArcGIS depends on.
# Remove all cached package lists to ensure a fresh update.
sudo rm -rf /var/lib/apt/lists/*
# Update the local package index to get the latest package information.
sudo apt-get -y update
# Install required packages for Kerberos
sudo apt-get install -y libkrb5-dev
Next, attach the init script to your cluster.
- Navigate to Compute > your cluster and click Edit to edit the cluster.
- Expand the Advanced options and click the Init scripts tab.
- In the Source drop-down, select Workspace.
- In the File path, select the path to the script arcgis_requirements.sh.
- Click Add, then Confirm and restart.
Last, install ArcGIS from the cluster UI.
- On the cluster page, click Libraries. Then click Install new.
- Under Library Source, select PyPI.
- Provide the package name as arcgis==<version>.
- Click Install.
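After the library shows as installed, you can confirm from a notebook attached to the cluster that the package is visible to Python. This is a minimal sketch using only the standard library. The helper name `installed_version` is hypothetical, and the sketch assumes the PyPI distribution name is `arcgis`, as in the step above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed version string, or None if the install did not succeed.
print(installed_version("arcgis"))
```

A result of `None` suggests the library installation did not complete. In that case, check the library status on the cluster page and confirm that the init script ran.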
For additional information on init scripts, review the Cluster-scoped init scripts (AWS | Azure | GCP) documentation.
For additional information on ArcGIS dependencies, refer to the Esri Developer System requirements documentation.