DBFS init script detection notebook

Scan your workspace for init scripts on DBFS.

Written by Adam Pavlacka

Last published at: March 26th, 2024

On December 1, 2023, init scripts stored on DBFS (including legacy global init scripts and cluster-named init scripts) reached end of life.

Databricks recommends you migrate any init scripts stored on DBFS to a supported type as soon as possible.

Review the Recommendations for init scripts (AWS | Azure | GCP) documentation for more information on supported init script types and locations.

Databricks Engineering has created a notebook to help you detect init scripts stored on DBFS across the clusters, cluster policies, jobs, and DLT pipelines in your workspace.
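To give a rough idea of what this kind of detection involves, here is a minimal sketch for the cluster case only. It is not the notebook's implementation. It assumes a cluster specification shaped like the JSON returned by the Clusters API, where each entry under init_scripts is keyed by its storage type (for example dbfs, workspace, or volumes). The function name dbfs_init_scripts and the example cluster are hypothetical.

```python
from typing import Any


def dbfs_init_scripts(cluster_spec: dict[str, Any]) -> list[str]:
    """Return the destinations of init scripts that a cluster spec stores on DBFS."""
    found = []
    for script in cluster_spec.get("init_scripts") or []:
        # Assumption: each entry is keyed by storage type, such as "dbfs" or "workspace".
        dbfs_info = script.get("dbfs")
        if dbfs_info and dbfs_info.get("destination"):
            found.append(dbfs_info["destination"])
    return found


# Hypothetical cluster spec used only for illustration.
example_cluster = {
    "cluster_name": "example-cluster",
    "init_scripts": [
        {"dbfs": {"destination": "dbfs:/databricks/scripts/install.sh"}},
        {"workspace": {"destination": "/Users/someone@example.com/setup.sh"}},
    ],
}

print(dbfs_init_scripts(example_cluster))
```

Only the first entry is reported here, because the second one is stored as a workspace file rather than on DBFS.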

Instructions

Warning

You must be a Databricks admin to run this notebook.

Prerequisites

Before running this notebook, you should complete the following:

  • Have a cluster running Databricks Runtime 13.3 LTS.
  • Import the helpers.zip package into your workspace.
  • Verify that the /helpers folder exists and contains the files __init__.py and checks.py.
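The last check can also be done in code. The following is a minimal sketch, assuming the helpers folder is reachable as an ordinary directory from where the code runs. The function name missing_helper_files and the relative path "helpers" are illustrative assumptions, so adjust the path to match where you imported the package.

```python
from pathlib import Path

# File names taken from the prerequisites list.
REQUIRED_FILES = ("__init__.py", "checks.py")


def missing_helper_files(helpers_dir: str) -> list[str]:
    """Return the required file names that are absent from helpers_dir."""
    folder = Path(helpers_dir)
    if not folder.is_dir():
        # The folder itself is missing, so every required file is missing too.
        return list(REQUIRED_FILES)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


# Assumed location of the imported package; an empty list means both files are present.
print(missing_helper_files("helpers"))
```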

 

Info

By default, this notebook checks the current workspace. To check additional workspaces, you need the following:

  • The workspace URL for each workspace you want to check.
  • An administrator personal access token (PAT) (AWS | Azure | GCP) for each workspace you want to check.

Detect init scripts stored on DBFS

  1. Download the DBFS init script detection notebook.
  2. Import the notebook to your workspace.
  3. Ensure the notebook is in the root of your workspace storage. It should not be in the /helpers folder.
  4. Start a cluster with Databricks Runtime 13.3 LTS.
  5. Run the notebook.
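If you want to confirm the runtime from a notebook cell before running the detection, a small check like the one below may help. It is a sketch that assumes the cluster exposes its runtime version through the DATABRICKS_RUNTIME_VERSION environment variable. The function name is_expected_runtime is hypothetical and is not part of the detection notebook.

```python
import os


def is_expected_runtime(env=None, expected="13.3"):
    """Return True if the reported runtime version matches the expected release."""
    env = os.environ if env is None else env
    # Assumption: the cluster sets DATABRICKS_RUNTIME_VERSION to a value such as "13.3".
    version = env.get("DATABRICKS_RUNTIME_VERSION", "")
    return version == expected or version.startswith(expected + ".")


# Outside Databricks the variable is normally unset, so this reports False.
print(is_expected_runtime())
```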

 

Once the notebook finishes running, it returns a list of init scripts stored on DBFS in your workspace.

If there are no init scripts stored on DBFS in your workspace, the notebook returns all of the following messages:

No clusters with init scripts on DBFS

No clusters with named init scripts on DBFS

No jobs with init scripts on DBFS

There are no DLT pipelines with init scripts on DBFS

There are no cluster policies with references to init scripts on DBFS

There are no init scripts that reference files on DBFS
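If you save the notebook output as text, you can confirm that all six messages are present with a short check. This is a minimal sketch: the message strings are copied from the list above, while the function name missing_all_clear_messages and the idea of checking a saved text copy of the output are assumptions made for illustration.

```python
# Messages the notebook prints when nothing is found, copied from the list above.
ALL_CLEAR_MESSAGES = (
    "No clusters with init scripts on DBFS",
    "No clusters with named init scripts on DBFS",
    "No jobs with init scripts on DBFS",
    "There are no DLT pipelines with init scripts on DBFS",
    "There are no cluster policies with references to init scripts on DBFS",
    "There are no init scripts that reference files on DBFS",
)


def missing_all_clear_messages(output: str) -> list[str]:
    """Return the all-clear messages that do not appear in the given output text."""
    return [message for message in ALL_CLEAR_MESSAGES if message not in output]


# Example: output in which the jobs check did not report an all-clear message.
sample_output = "\n".join(m for m in ALL_CLEAR_MESSAGES if "jobs" not in m)
print(missing_all_clear_messages(sample_output))
```

An empty list means every check reported no init scripts on DBFS. Any message that is returned points to a category that needs a closer look.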

 

Check other workspaces for init scripts stored on DBFS

After you run the detection notebook once in your current workspace, a widget appears at the top of the notebook.

  1. Enter the PAT for the workspace you want to check into the widget.
  2. Enter the Workspace URL for the workspace you want to check into the widget.
  3. Re-run the notebook.

 

The results displayed apply to the workspace specified in the widget.
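For context, a workspace URL and a PAT are typically combined into an authenticated REST request like the one sketched below. This is not how the notebook is confirmed to work internally. It only illustrates the general pattern, using the Clusters API list endpoint as an example. The function name build_clusters_list_request, the example URL, and the placeholder token are all hypothetical, and the request is built but never sent.

```python
import urllib.request


def build_clusters_list_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build (without sending) a request that lists clusters in the given workspace."""
    base = workspace_url.rstrip("/")
    return urllib.request.Request(
        f"{base}/api/2.0/clusters/list",
        # The PAT is passed as a bearer token.
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Hypothetical workspace URL and placeholder token; never hard-code a real PAT.
request = build_clusters_list_request(
    "https://example.cloud.databricks.com/", "<personal-access-token>"
)
print(request.full_url)
```

Because the token grants administrator access, treat it as a secret and avoid saving it in notebook cells or in version control.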

 

Migrate your init scripts

If the DBFS init script detection notebook detects init scripts, review the Migrate init scripts from DBFS (AWS | Azure | GCP) documentation for details.