You can use the Databricks Workspace API (AWS | Azure | GCP) to recursively list all workspace objects under a given path.
Common use cases for this include:
- Indexing all notebook names and types for all users in your workspace.
- Using the output, in conjunction with other API calls, to delete unused workspace objects or to manage notebooks.
- Dynamically getting the absolute path of a notebook under a given user and submitting it to the Databricks Jobs API to trigger notebook-based jobs (AWS | Azure | GCP); a sketch of this appears at the end of this article.
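To see the shape of the response before recursing, you can make a single, non-recursive call to the list endpoint. This is a minimal sketch; the instance name, token, and /Users path are placeholders for your own values. Each returned object carries an object_type and a path, which is what the recursive function below keys on.
%python
import requests

headers = {'Authorization': 'Bearer <token>'}

# List one level of the workspace tree under /Users.
response = requests.get(
    "https://<instance-name>/api/2.0/workspace/list",
    headers=headers,
    params={"path": "/Users"},
)
response.raise_for_status()
print(response.json())
# Example response shape:
# {"objects": [{"object_type": "DIRECTORY",
#               "path": "/Users/someone@example.com",
#               "object_id": 1234567890}, ...]}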
Define function
This example code defines the function and the logic needed to run it.
You should place this code at the beginning of your notebook.
You need to replace <token> with your personal access token (AWS | Azure | GCP).
%python
import requests

# Authorization header. The <token> placeholder is your personal access token.
headers = {
    'Authorization': 'Bearer <token>',
}

# Define rec_req as a recursive function.
# Note: The default path is "/", which scans all users and folders.
def rec_req(instanceName, loc="/"):
    url = '{}/api/2.0/workspace/list'.format(instanceName)
    response = requests.get(url, headers=headers, params={'path': loc})
    # Raise an exception if the directory or URL does not exist.
    response.raise_for_status()
    for obj in response.json().get('objects', []):
        if obj['object_type'] == 'DIRECTORY':
            # Recurse into each folder.
            rec_req(instanceName, obj['path'])
        elif obj['object_type'] == 'NOTEBOOK':
            # Print the notebook's metadata, including its absolute path.
            print(obj)
        else:
            # Skip other object types, such as imported libraries.
            pass
Run function
Once you have defined the function in your notebook, you can call it at any time.
You need to replace <instance-name> with the instance name (AWS | Azure | GCP) of your Databricks deployment. This is typically the base workspace URL, without any trailing path or workspace ID parameter.
You need to replace <path> with the full path you want to search. This is typically /.
%python
rec_req("https://<instance-name>", "<path>")
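If you want to act on the results instead of printing them, you can adapt the function to collect notebook paths into a list and feed them to other APIs, as mentioned in the use cases above. The following is a minimal sketch, not part of the function defined earlier: it assumes the headers defined at the beginning of the notebook, and the <instance-name> and <cluster-id> placeholders, run name, and task key are illustrative values you would replace. It gathers all notebook paths and submits the first one as a one-time run through the Jobs API 2.1 runs/submit endpoint.
%python
import requests

# A variant of rec_req that returns notebook paths instead of printing them.
def collect_notebooks(instanceName, loc="/"):
    url = '{}/api/2.0/workspace/list'.format(instanceName)
    response = requests.get(url, headers=headers, params={'path': loc})
    response.raise_for_status()
    paths = []
    for obj in response.json().get('objects', []):
        if obj['object_type'] == 'DIRECTORY':
            paths.extend(collect_notebooks(instanceName, obj['path']))
        elif obj['object_type'] == 'NOTEBOOK':
            paths.append(obj['path'])
    return paths

notebook_paths = collect_notebooks("https://<instance-name>", "/Users")

if notebook_paths:
    # Submit the first notebook found as a one-time job run.
    # The run name, task key, and cluster ID are placeholders.
    payload = {
        "run_name": "run-from-workspace-listing",
        "tasks": [
            {
                "task_key": "run_notebook",
                "existing_cluster_id": "<cluster-id>",
                "notebook_task": {"notebook_path": notebook_paths[0]},
            }
        ],
    }
    run = requests.post(
        "https://<instance-name>/api/2.1/jobs/runs/submit",
        headers=headers,
        json=payload,
    )
    run.raise_for_status()
    print(run.json())  # Returns a run_id you can poll for run status.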