Problem: Databricks Job Fails Because Library is Not Installed

Problem

A Databricks job fails with import errors because it requires a library that is not yet installed on the cluster.

Cause

The error occurs because the job starts running before the required libraries have finished installing. Library installation can be delayed when you run a job on a cluster in either of the following situations:

  • When you start an existing cluster that is in a terminated state and has libraries attached.
  • When you start a new cluster that uses a shared library (a library installed on all clusters).
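
To confirm that a delay in library installation is the cause, you can inspect the per-library status that the Libraries API reports for a cluster. The following is a minimal sketch, assuming a response shaped like the one returned by the `cluster-status` endpoint (a `library_statuses` list whose entries carry a `status` field). The helper names, the cluster ID, and the sample libraries are hypothetical, and the sample dictionary stands in for a real API response.

```python
# Statuses that mean a library is still being installed.
# (Assumed from the Libraries API status values; verify against your workspace.)
NOT_READY = {"PENDING", "RESOLVING", "INSTALLING"}


def libraries_ready(cluster_status):
    """Return True only if every library reported for the cluster is INSTALLED."""
    statuses = cluster_status.get("library_statuses", [])
    return all(entry.get("status") == "INSTALLED" for entry in statuses)


def pending_libraries(cluster_status):
    """Return the library specs that are still installing."""
    statuses = cluster_status.get("library_statuses", [])
    return [entry["library"] for entry in statuses
            if entry.get("status") in NOT_READY]


# Hypothetical response body for illustration only.
sample = {
    "cluster_id": "1234-567890-abcde123",
    "library_statuses": [
        {"library": {"pypi": {"package": "simplejson"}}, "status": "INSTALLED"},
        {"library": {"jar": "dbfs:/FileStore/jars/example.jar"}, "status": "INSTALLING"},
    ],
}

print(libraries_ready(sample))
print(pending_libraries(sample))
```

If a job starts while `pending_libraries` still returns entries, imports of those libraries can fail in the way described above.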

Solution

If a job requires certain libraries, attach them as dependent libraries within the job itself so that they are installed before the job runs. The following steps show how to set up dependent libraries when you create a job.

Add libraries as dependent libraries when you create the job.

  1. Open the Add Dependent Library dialog:

    ../_images/add-lib-aws.png
  2. Choose the library:

    ../_images/choose-lib-aws.png
  3. Verify the library:

    ../_images/dep-lib-aws.png
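
If you create jobs programmatically instead of through the UI, the same dependent libraries can be declared in the job definition. The following is a minimal sketch, assuming the Jobs API 2.1 request shape in which each task carries its own `libraries` list. The function name, job name, notebook path, cluster ID, and library values are hypothetical placeholders. The code only builds and prints the request body and does not call the API.

```python
import json


def build_job_payload(job_name, notebook_path, cluster_id, libraries):
    """Build a job-creation request body with dependent libraries on the task.

    Assumes the Jobs API 2.1 layout: a `tasks` list where each task has
    a `libraries` field listing the libraries it depends on.
    """
    return {
        "name": job_name,
        "tasks": [
            {
                "task_key": "main",
                "existing_cluster_id": cluster_id,
                "notebook_task": {"notebook_path": notebook_path},
                # Dependent libraries attached to the job task itself,
                # rather than relying on libraries attached to the cluster.
                "libraries": libraries,
            }
        ],
    }


# Hypothetical values for illustration only.
payload = build_job_payload(
    job_name="example-job",
    notebook_path="/Users/someone@example.com/example-notebook",
    cluster_id="1234-567890-abcde123",
    libraries=[
        {"pypi": {"package": "simplejson"}},
        {"jar": "dbfs:/FileStore/jars/example.jar"},
    ],
)

print(json.dumps(payload, indent=2))
```

Declaring the libraries on the task keeps the dependency with the job, so the job does not depend on how quickly cluster-level libraries finish installing.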