cx_Oracle is a module that enables access to Oracle Database and conforms to the Python database API specification. Using
cx_Oracle requires Oracle Client libraries to be installed. These provide the necessary network connectivity allowing cx_Oracle to access an Oracle Database instance.
Oracle Instant Client enables the development and deployment of applications that connect to Oracle Database, either on-premise or in the Cloud. The Instant Client libraries provide the necessary network connectivity and advanced data features to make full use of Oracle Database. The libraries are used by the Oracle APIs of popular languages and environments including Python, Node.js, Go, PHP and Ruby, as well as providing access for Oracle Call Interface (OCI), Oracle C++ Call Interface (OCCI), JDBC-OCI, ODBC and Pro*C applications.
Since Oracle Instant Client is not included in the Databricks clusters, it has to be installed and environment variables need to be configured in order to connect to an on-prem or cloud oracle server.
If Oracle Instant Client is not installed or the environment variables are not configured correctly, you will get a database error.
DatabaseError: DPI-1047: Cannot locate a 64-bit Oracle Client library: 'libclntsh.so: cannot open shared object file: No such file or directory'.
The following steps need to be performed in order to properly install
cx_Oracle and the client libraries. Instead of performing these steps manually, you should use an init script.
- Download the latest version of the Oracle Instant Client Basic Light Package (ZIP) from the Oracle Instant Client Downloads for Linux x86-64 (64-bit) page.
- Unzip the contents to a folder.
- Upload the instant client folder to a cluster.
- Copy the instant client folder to a system directory.
- Set the environment variables
- Restart the cluster.
Use this template to create a cluster-scoped init script that automatically downloads and installs the Oracle Instant Client.
dbutils.fs.put("dbfs:/databricks/<init-script-folder>/oracle_ctl.sh",""" #!/bin/bash wget --quiet -O /tmp/instantclient-basiclite-linuxx64.zip https://download.oracle.com/otn_software/linux/instantclient/instantclient-basiclite-linuxx64.zip unzip /tmp/instantclient-basiclite-linuxx64.zip -d /databricks/driver/oracle_ctl/ sudo echo 'export LD_LIBRARY_PATH="/databricks/driver/oracle_ctl/"' >> /databricks/spark/conf/spark-env.sh sudo echo 'export ORACLE_HOME="/databricks/driver/oracle_ctl/"' >> /databricks/spark/conf/spark-env.sh """, True)
Install the newly created init script as a cluster-scoped init script.
You will need the full path to the location of the script (
cx_Oracle as a cluster-installed library.