Problem
When attempting to import the Delta Live Tables (DLT) module, you encounter the following error.
“DLTImportException: Delta Live Tables module is not supported on Spark Connect clusters.”
Cause
The file /databricks/python_shell/dbruntime/PostImportHook.py in the Databricks environment overrides all dlt import statements to support Delta Live Tables. This creates a naming conflict, because dlt is also the name of an open-source PyPI package that certain Python libraries depend on. The Databricks Runtime import hook bypasses the try/except block these libraries typically use to handle a missing import gracefully, resulting in the import error.
Solution
You can address this issue with a cluster-scoped init script (which you can also target at a specific job's cluster) or with cell commands in a notebook.
Use an init script
- Use the workspace file browser to create a new file (AWS | Azure | GCP) in your home directory. Call it removedbdlt.sh.
- Open the removedbdlt.sh file.
- Copy and paste this init script into removedbdlt.sh.
#!/bin/bash
# Remove the built-in Delta Live Tables module from the cluster.
rm -rf /databricks/spark/python/dlt
- Follow the documentation to configure a cluster-scoped init script (AWS | Azure | GCP) as a workspace file.
- Specify the path to the init script. Since you created removedbdlt.sh in your home directory, the path should look like /Users/<your-username>/removedbdlt.sh.
- After configuring the init script, restart the cluster. You can then verify the result as shown below.
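Once the cluster is back up, you can optionally confirm that the init script worked; a minimal check from a notebook, using the path removed by the script above.

import os

# removedbdlt.sh deletes this directory, so it should no longer exist.
print(os.path.exists("/databricks/spark/python/dlt"))  # expected: False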
Use cell commands
Run the following commands in a notebook. First, install the open-source dlt package in a cell of its own.

%pip install dlt

Then, in a new cell, put the package's install location at the front of the import path so it takes precedence over the built-in module, and import it.

import site
import sys

# Ensure the pip-installed dlt package is found before the built-in
# Delta Live Tables module.
sys.path.insert(0, site.getsitepackages()[0])

import dlt
dlt.__version__
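To confirm that import dlt now resolves to the PyPI package rather than the built-in module, you can optionally check where the module was loaded from; a minimal sketch.

import dlt

# Should point into site-packages, not /databricks/spark/python/dlt.
print(dlt.__file__)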