Job fails while installing ODBC Driver 18 for SQL Server using an init script

Add the msodbcsql18 library directory to LD_LIBRARY_PATH, then append the updated LD_LIBRARY_PATH to /etc/environment.

Written by julian.campabadal

Last published at: December 20th, 2024

Problem

When you install ODBC Driver 18 for SQL Server on Databricks compute using an init script, your job intermittently fails with the following error.
 

Task 4 in stage 2.0 failed 4 times, most recent failure: Lost task 4.3 in stage 2.0 (TID 31) (10.148.45.37 executor 1): SenzingEngineException{errorCode=-2, input='', senzingError='1000E|Unhandled Database Error '(0:01000[unixODBC][Driver Manager]Can't open lib 'ODBC Driver 18 for SQL Server' : file not found)''}

 

Cause

Apache Spark executors and the driver manager (unixODBC) can’t find the shared library file for msodbcsql18 because the library's directory is not included in LD_LIBRARY_PATH.

LD_LIBRARY_PATH is an environment variable that the Linux dynamic linker uses to locate shared libraries. When it is not set correctly, the system can’t find the libraries it needs to load.
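To confirm this cause on a node, you can check whether the driver's library directory appears in LD_LIBRARY_PATH. The following is a minimal diagnostic sketch, not part of the original fix. The helper name path_contains is hypothetical, and the directory is the one used in the Solution section.

```shell
#!/bin/bash
# Diagnostic sketch: report whether a directory is an entry in a
# colon-separated search path such as LD_LIBRARY_PATH.
# path_contains is a hypothetical helper name used for illustration.
path_contains() {
  local search_path="$1" dir="$2"
  # Wrap both sides in colons so only whole entries match.
  case ":${search_path}:" in
    *":${dir}:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Directory taken from the Solution section of this article.
DRIVER_DIR=/opt/microsoft/msodbcsql18/lib64

if path_contains "${LD_LIBRARY_PATH:-}" "$DRIVER_DIR"; then
  echo "LD_LIBRARY_PATH includes $DRIVER_DIR"
else
  echo "LD_LIBRARY_PATH is missing $DRIVER_DIR"
fi
```

If the script reports that the directory is missing on an executor node, the error above is consistent with the cause described here.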

 

Solution

Add the LD_LIBRARY_PATH variable to your init script using the following code.

The first line adds the msodbcsql18 library directory to LD_LIBRARY_PATH for the current session, so that any process started after this point (including your Spark executors) can locate the ODBC driver.

The second line appends the same LD_LIBRARY_PATH setting to /etc/environment. Appending ensures:

  • the updated LD_LIBRARY_PATH is applied system-wide. Any future processes, even those spawned by different users or by different lifecycle events (for example, executor restarts), inherit the path.
  • all executors and drivers on the cluster have access to the correct library path, even if they are restarted or scaled dynamically.

 

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/microsoft/msodbcsql18/lib64

echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/microsoft/msodbcsql18/lib64' >> /etc/environment
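If LD_LIBRARY_PATH is empty when the init script runs, the export line above produces a value that starts with a colon, and the dynamic linker treats an empty entry as the current working directory. The following sketch shows one way to build the value without an empty entry or a duplicate. It is an optional variation rather than the documented fix, and append_path_entry is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: append a directory to a colon-separated path value without
# duplicating it or leaving an empty entry.
# append_path_entry is a hypothetical helper name used for illustration.
append_path_entry() {
  local current="$1" dir="$2"
  case ":${current}:" in
    *":${dir}:"*)
      # Already present: return the value unchanged.
      printf '%s\n' "$current"
      ;;
    *)
      # Add a separating colon only when the current value is non-empty.
      printf '%s\n' "${current:+${current}:}${dir}"
      ;;
  esac
}

# Directory taken from the original export line.
DRIVER_DIR=/opt/microsoft/msodbcsql18/lib64

LD_LIBRARY_PATH="$(append_path_entry "${LD_LIBRARY_PATH:-}" "$DRIVER_DIR")"
export LD_LIBRARY_PATH
echo "$LD_LIBRARY_PATH"
```

The sketch only changes the current session. Persisting the value to /etc/environment still requires the second line of the original code.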