Install Turbodbc via init script

Turbodbc is a Python module that uses the ODBC interface to access relational databases.

It depends on the libboost-all-dev, unixodbc-dev, and python-dev packages, which must be installed before Turbodbc itself.

You can install these manually, or you can use an init script to automate the install.

Create the init script

Run this sample script in a notebook to create the init script on your cluster.

dbutils.fs.put("dbfs:/<path-to-init-script>/", """#!/bin/bash
# Install dependent packages
sudo apt-get -y install libboost-all-dev unixodbc-dev python-dev
pip install turbodbc==4.1.1
""", True)

Remember the path to the init script. You will need it when configuring your cluster.
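
If you want to change the package list or the pinned Turbodbc version without editing the script text by hand, you can generate the text first. The following is a minimal sketch, not part of the original sample: the helper name build_init_script and the two constants are hypothetical, and the package names and version are copied from the sample script above.

```python
# Hypothetical helper: assembles the init script text from the values used in this article.
APT_PACKAGES = ["libboost-all-dev", "unixodbc-dev", "python-dev"]
TURBODBC_VERSION = "4.1.1"


def build_init_script(apt_packages, turbodbc_version):
    """Return the text of a bash init script that installs Turbodbc."""
    lines = [
        "#!/bin/bash",
        "# Install dependent packages",
        "sudo apt-get -y install " + " ".join(apt_packages),
        "pip install turbodbc==" + turbodbc_version,
    ]
    # End with a newline so the last command is a complete line.
    return "\n".join(lines) + "\n"


script_text = build_init_script(APT_PACKAGES, TURBODBC_VERSION)
print(script_text)

# In a Databricks notebook, the text could then be written with:
# dbutils.fs.put("dbfs:/<path-to-init-script>/", script_text, True)
```

The final dbutils.fs.put call is left as a comment because dbutils is only available inside a Databricks notebook.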

Configure the init script

Follow the documentation to configure a cluster-scoped init script.

Specify the path to the init script. Use the same path that you used in the sample script.

After configuring the init script, restart the cluster.
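
Once the cluster is running again, you may want to confirm from a notebook that Turbodbc was installed. The following is a minimal sketch that uses only the Python standard library. The helper name check_package is hypothetical, and the sketch assumes Python 3.8 or later, where importlib.metadata is available.

```python
import importlib.util
from importlib import metadata


def check_package(module_name, dist_name=None):
    """Return (installed, version); version is None when it cannot be determined."""
    # find_spec returns None when a top-level module cannot be found.
    if importlib.util.find_spec(module_name) is None:
        return (False, None)
    try:
        # Assumes the distribution name matches the module name unless dist_name is given.
        version = metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        version = None
    return (True, version)


installed, version = check_package("turbodbc")
print("turbodbc installed:", installed, "version:", version)
```

If the first value is False, check that the init script path configured on the cluster matches the path used when the script was created.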