Job deployed with DAB to install packages from private PyPI-compliant repository failing

Configure the PIP_EXTRA_INDEX_URL environment variable as part of the cluster specification in the databricks.yml file.

Written by umakanth.charakanam

Last published at: October 24th, 2025

Problem

When you deploy a job with Databricks Asset Bundles (DAB) that installs Python packages from a private PyPI-compliant repository, the job fails with the following error message.

Run failed with error message
  Library installation failed for library due to user error for pypi {
   package: "<package-name>"
   repo: "<private-repository-url>"
 }
  Error messages:
Library installation attempted on the driver node of cluster <cluster-id> and failed. Pip could not find a version that satisfies the requirement for the library. Please check your library version and dependencies. Error code: ERROR_NO_MATCHING_DISTRIBUTION, error message: org.apache.spark.SparkException: Process List(/bin/su, libraries,
-c, bash /local_disk0/.ephemeral_nfs/cluster_libraries/python/python_start_clusterwide.sh /local_disk0/.ephemeral_nfs/cluster_libraries/python/bin/pip install '<package-name>' --index-url <private-repository-url>…
***WARNING: message truncated. Skipped *** bytes of output**

 

Cause

When a job library specifies a private PyPI-compliant repository, Databricks passes that repository to pip as --index-url, which replaces the default public index. As a result, pip resolves all dependencies exclusively through the private repository.

 

This causes failures when some dependencies are only available on the public PyPI repository or other indexes.
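
For context, a job library that installs from a private repository is typically declared in databricks.yml along the following lines. The task name and structure shown here are illustrative placeholders; the repo value is what pip receives as --index-url.

tasks:
  - task_key: main
    job_cluster_key: ${bundle.target}-${bundle.name}-job-cluster
    libraries:
      - pypi:
          package: "<package-name>"        # package hosted on the private index
          repo: "<private-repository-url>" # passed to pip as --index-url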

 

Solution

Enable pip to install packages from multiple indexes. Set the PIP_EXTRA_INDEX_URL environment variable as part of the cluster specification in the databricks.yml file.

 

This environment variable mirrors pip’s --extra-index-url option, which allows an additional package index, such as the public PyPI repository, to be searched alongside the private PyPI-compliant repository.
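
For illustration, the underlying pip behavior looks roughly like the following. The public index URL shown is an assumption; substitute your own values.

# Only the private index is consulted, so dependencies hosted only on public PyPI fail to resolve.
pip install '<package-name>' --index-url <private-repository-url>

# With an extra index set, pip can also fall back to the public PyPI repository.
PIP_EXTRA_INDEX_URL=https://pypi.org/simple pip install '<package-name>' --index-url <private-repository-url>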

 

Example configuration

targets:
  dev:
    mode: development
    default: true
    resources:
      jobs:
        my_job:
          job_clusters:
            - job_cluster_key: ${bundle.target}-${bundle.name}-job-cluster
              new_cluster:
                num_workers: 2
                spark_version: "14.3.x-cpu-ml-scala2.12"
                node_type_id: Standard_F4
                spark_env_vars:
                  PIP_EXTRA_INDEX_URL: "{{secrets/<your-scope>/<your-extra-index-url>}}"
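
The value referenced by {{secrets/<your-scope>/<your-extra-index-url>}} should be the extra index URL itself, for example the public PyPI index or a URL that embeds credentials for another private index. One way to populate that secret, assuming a recent version of the Databricks CLI and placeholder scope and key names, is roughly:

databricks secrets create-scope <your-scope>
databricks secrets put-secret <your-scope> <your-extra-index-url> --string-value "https://pypi.org/simple"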

                                  

For serverless notebooks and jobs, you can configure PIP_EXTRA_INDEX_URL through the UI and apply it across the entire workspace. 

 

For more details, refer to the “Configure default Python package repositories” section of the Configure the serverless environment (AWS | Azure | GCP) documentation.