Install Turbodbc via init script

Install Turbodbc and its dependencies (libboost-all-dev, unixodbc-dev, and python-dev) with an init script.

Written by John.Lourdu

Last published at: May 11th, 2022

Turbodbc is a Python module that uses the ODBC interface to access relational databases.

It depends on the libboost-all-dev, unixodbc-dev, and python-dev packages, which must be installed before Turbodbc itself.

You can install these manually, or you can use an init script to automate the install.

Create the init script

Run this sample script in a notebook to create the init script on your cluster. Replace <path-to-init-script> with the DBFS location where you want to store the script.

%python

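dbutils.fs.mkdirs("dbfs:/<path-to-init-script>")
# The shebang must be the first line of the file, so it starts right after the opening quotes.
dbutils.fs.put("dbfs:/<path-to-init-script>/turbodbc_install.sh", """#!/bin/bash
# Install the system packages that Turbodbc depends on
sudo apt-get -y install libboost-all-dev unixodbc-dev python-dev
# Install Turbodbc after its dependencies are in place
pip install turbodbc==4.1.1
""", True)  # True overwrites an existing file at this path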

Remember the path to the init script. You will need it when configuring your cluster.
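If you prefer to assemble the script text in Python before writing it, the following is a minimal sketch of one way to do that. The helper name build_init_script is hypothetical and not part of any Databricks API. It keeps the shebang on the first line of the file, which is the only place a shebang is recognized, and places the package installation before the pip install.

```python
def build_init_script(apt_packages, pip_requirement):
    """Return init script text with the shebang on the very first line.

    Hypothetical helper for illustration only.
    """
    lines = [
        "#!/bin/bash",
        "# Install the system packages first, then the Python module",
        "sudo apt-get -y install " + " ".join(apt_packages),
        "pip install " + pip_requirement,
    ]
    # End with a newline so the last command is a complete line
    return "\n".join(lines) + "\n"


script = build_init_script(
    ["libboost-all-dev", "unixodbc-dev", "python-dev"],
    "turbodbc==4.1.1",
)
print(script)
```

In a notebook, the resulting script string could then be passed as the second argument to dbutils.fs.put.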

Configure the init script

Follow the documentation for your cloud (AWS, Azure, or GCP) to configure a cluster-scoped init script.

Specify the path to the init script. Use the same path that you used in the sample script.
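If you configure the cluster through a JSON cluster specification rather than the UI, the path is supplied as an entry in the cluster's init script list. The sketch below assumes the init_scripts field with a dbfs destination, as used in Clusters API cluster specifications; confirm the exact field names against the documentation for your cloud before relying on it. The helper name init_script_entry is hypothetical.

```python
import json


def init_script_entry(dbfs_path):
    """Build one init_scripts entry pointing at a script stored in DBFS.

    Hypothetical helper; the field layout is an assumption to verify
    against the cluster configuration documentation.
    """
    if not dbfs_path.startswith("dbfs:/"):
        raise ValueError("expected a path of the form dbfs:/...")
    return {"dbfs": {"destination": dbfs_path}}


# Use the same path that the sample script wrote to
cluster_settings = {
    "init_scripts": [
        init_script_entry("dbfs:/<path-to-init-script>/turbodbc_install.sh")
    ]
}
print(json.dumps(cluster_settings, indent=2))
```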

After configuring the init script, restart the cluster.
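Once the cluster is running again, you may want to confirm that the module was installed. The following is a minimal sketch that uses only the Python standard library; the helper name installed_version is hypothetical. It returns None when the package is not present in the current environment.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Run in a notebook attached to the restarted cluster
print(installed_version("turbodbc"))
```

If the init script ran successfully, this should report the version pinned in the script. A result of None suggests the script did not run or failed partway, in which case the cluster's init script logs are the place to look.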