Use cx_Oracle to connect to an Oracle server

cx_Oracle is a module that enables access to Oracle Database and conforms to the Python database API specification. Using cx_Oracle requires Oracle Client libraries to be installed. These provide the necessary network connectivity allowing cx_Oracle to access an Oracle Database instance.

Oracle Instant Client enables the development and deployment of applications that connect to Oracle Database, either on-premise or in the Cloud. The Instant Client libraries provide the necessary network connectivity and advanced data features to make full use of Oracle Database. The libraries are used by the Oracle APIs of popular languages and environments including Python, Node.js, Go, PHP and Ruby, as well as providing access for Oracle Call Interface (OCI), Oracle C++ Call Interface (OCCI), JDBC-OCI, ODBC and Pro*C applications.

Since Oracle Instant Client is not included in the Databricks clusters, it has to be installed and environment variables need to be configured in order to connect to an on-prem or cloud oracle server.

If Oracle Instant Client is not installed or the environment variables are not configured correctly, you will get a database error.

DatabaseError: DPI-1047: Cannot locate a 64-bit Oracle Client library: 'libclntsh.so: cannot open shared object file: No such file or directory'.

Install overview

The following steps need to be performed in order to properly install cx_Oracle and the client libraries. Instead of performing these steps manually, you should use an init script.

  1. Download the latest version of the Oracle Instant Client Basic Light Package (ZIP) from the Oracle Instant Client Downloads for Linux x86-64 (64-bit) page.
  2. Unzip the contents to a folder.
  3. Upload the instant client folder to a cluster.
  4. Copy the instant client folder to a system directory.
  5. Set the environment variables LD_LIBRARY_PATH and ORACLE_HOME.
  6. Install cx_Oracle from PyPI.
  7. Restart the cluster.

Create an init script

Use this template to create a cluster-scoped init script that automatically downloads and installs the Oracle Instant Client.

​​​​​dbutils.fs.put("dbfs:/databricks/<init-script-folder>/oracle_ctl.sh","""
#!/bin/bash
wget --quiet -O /tmp/instantclient-basiclite-linuxx64.zip https://download.oracle.com/otn_software/linux/instantclient/instantclient-basiclite-linuxx64.zip
unzip /tmp/instantclient-basiclite-linuxx64.zip -d /databricks/driver/oracle_ctl/
sudo echo 'export LD_LIBRARY_PATH="/databricks/driver/oracle_ctl/"' >> /databricks/spark/conf/spark-env.sh
sudo echo 'export ORACLE_HOME="/databricks/driver/oracle_ctl/"' >> /databricks/spark/conf/spark-env.sh
""", True)

Configure the init script

Install the newly created init script as a cluster-scoped init script.

You will need the full path to the location of the script (dbfs:/databricks/<init-script-folder>/oracle_ctl.sh).

Install cx_Oracle library

Install cx_Oracle as a cluster-installed library.

Restart the cluster

Restart your cluster after cx_Oracle and the client libraries have been installed.

You can now access your Oracle server.