Databricks job fails because library is not installed

Learn how to prevent Databricks jobs from failing due to uninstalled libraries.

Written by Adam Pavlacka

Last published at: May 11th, 2022

Problem

A Databricks job fails with import errors because it requires a library that is not yet installed on the cluster.

Cause

The error occurs because the job starts running before the required libraries have finished installing. A cluster can be delayed in installing libraries in either of the following situations:

  • When you start an existing cluster that was in a terminated state and has libraries attached.
  • When you start a new cluster that uses a shared library (a library installed on all clusters).
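
To see whether a cluster is still installing libraries, its library status can be inspected before relying on it. The following is a minimal sketch, not a definitive implementation. It assumes a response shaped like the one returned by the Libraries API cluster status endpoint (a `library_statuses` list whose entries carry a `library` and a `status`). The sample data, cluster ID, package names, and the helper `libraries_not_ready` are hypothetical.

```python
# Assumption: only the INSTALLED status means a library is ready for use.
READY_STATES = {"INSTALLED"}


def libraries_not_ready(cluster_status):
    """Return (library, status) pairs for libraries that are not installed yet."""
    pending = []
    for entry in cluster_status.get("library_statuses", []):
        status = entry.get("status")
        if status not in READY_STATES:
            pending.append((entry.get("library"), status))
    return pending


# Hypothetical sample response; in practice this would come from the
# Libraries API for a specific cluster.
sample_status = {
    "cluster_id": "example-cluster-id",
    "library_statuses": [
        {"library": {"pypi": {"package": "requests"}}, "status": "INSTALLED"},
        {"library": {"pypi": {"package": "example-package"}}, "status": "INSTALLING"},
    ],
}

print(libraries_not_ready(sample_status))
```

A non-empty result means a job started on the cluster at that moment could hit the import errors described above.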

Solution

If a job requires certain libraries, attach them as dependent libraries within the job itself. Refer to the following article and steps to set up dependent libraries when you create a job.

Add libraries as dependent libraries when you create a job (AWS | Azure).

1. Open the Add Dependent Library dialog:

AWS and Azure: [Image: arrow pointing to Add.]

2. Choose a library:

AWS and Azure: [Image: choose a library.]

3. Verify that the library appears in the list of dependent libraries:

AWS and Azure: [Image: list of dependent libraries.]
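
When a job is defined programmatically rather than through the UI, the same idea applies: the libraries are declared on the job's task, so they are installed before the task runs. The sketch below only builds and inspects a job definition and does not call any API. The field names (`tasks`, `task_key`, `existing_cluster_id`, `notebook_task`, `libraries`) are assumed to follow the Jobs API 2.1 request shape and should be checked against the API reference. The helper names, job name, notebook path, cluster ID, and package name are hypothetical.

```python
import json


def build_job_settings(job_name, notebook_path, cluster_id, pypi_packages):
    """Build a job definition whose single task declares its own libraries."""
    libraries = [{"pypi": {"package": package}} for package in pypi_packages]
    return {
        "name": job_name,
        "tasks": [
            {
                "task_key": "main",
                "existing_cluster_id": cluster_id,
                "notebook_task": {"notebook_path": notebook_path},
                # Dependent libraries attached to the task itself.
                "libraries": libraries,
            }
        ],
    }


def dependent_libraries(job_settings):
    """Map each task_key to the libraries declared on that task."""
    return {
        task["task_key"]: task.get("libraries", [])
        for task in job_settings.get("tasks", [])
    }


# Hypothetical values, for illustration only.
settings = build_job_settings(
    job_name="example-job",
    notebook_path="/Users/someone@example.com/example-notebook",
    cluster_id="example-cluster-id",
    pypi_packages=["example-package==1.0.0"],
)

print(json.dumps(settings, indent=2))
print(dependent_libraries(settings))
```

The `dependent_libraries` helper mirrors the verification step: a task that maps to an empty list has no dependent libraries and depends on whatever happens to be installed on the cluster.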