Databricks cannot access a notebook in GitHub

Check your file type or GitHub credentials and permissions.

Written by david.vega

Last published at: September 12th, 2024

Problem

Your Databricks job fails to access a notebook in a GitHub repository that it could access previously, and returns the following error.

Unable to access the notebook 'resources/notebooks/examplename'. Either it does not exist, or the identity used to run this job, <identity-name> (<identity-reference>), lacks the required permissions.


Cause

There are two possible causes.

The first is that the file was changed from a Databricks notebook to a standard Python script, so it is missing the required notebook identifier.

The second is an issue with the GitHub credentials or permissions associated with the service principal or user running the job.

Solution

If the cause relates to file type

If the .py file is meant to run as a notebook, make sure that the examplename.py file in the GitHub repository starts with the line # Databricks notebook source

For more information, review the Export and import Databricks notebooks (AWS | Azure | GCP) documentation.
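The header check described above can be run locally against a checkout of the repository before you push. The following is a minimal sketch, not an official Databricks tool. The function name is_databricks_notebook_source is hypothetical, and the sketch assumes the file is UTF-8 text.

```python
from pathlib import Path

# The identifier line named in this article.
NOTEBOOK_HEADER = "# Databricks notebook source"


def is_databricks_notebook_source(path):
    """Return True if the first line of the file is the notebook identifier.

    Hypothetical helper for illustration only.
    """
    # "utf-8-sig" also tolerates a byte-order mark at the start of the file.
    with Path(path).open(encoding="utf-8-sig") as handle:
        first_line = handle.readline()
    # Ignore the line ending and trailing whitespace when comparing.
    return first_line.rstrip() == NOTEBOOK_HEADER
```

If the function returns False for a file that the job runs as a notebook, add the identifier as the first line of the file and commit the change.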

If the file is meant to run as a standard Python script, update the job configuration in Databricks to treat it as one. Change the task type from Notebook to Python script and include the .py extension in the file path.

For more information, refer to the Use version-controlled source code in a Databricks job (AWS | Azure | GCP) documentation.
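If you manage the job as JSON settings rather than through the UI, the difference between the two task types can be sketched as follows. This is an illustrative fragment that assumes the Jobs API 2.1 field names (notebook_task, spark_python_task, and source set to GIT). The task keys and the helper function names are hypothetical, and compute settings are omitted, so verify the structure against the current API reference before using it.

```python
def notebook_task(notebook_path):
    """Build a Git-sourced notebook task (hypothetical helper).

    The path is given without the .py extension, as in the error message above.
    """
    return {
        "task_key": "run_as_notebook",  # hypothetical task key
        "notebook_task": {"notebook_path": notebook_path, "source": "GIT"},
    }


def python_script_task(python_file):
    """Build a Git-sourced Python script task (hypothetical helper).

    The path must include the .py extension.
    """
    if not python_file.endswith(".py"):
        raise ValueError("Python script paths must include the .py extension")
    return {
        "task_key": "run_as_script",  # hypothetical task key
        "spark_python_task": {"python_file": python_file, "source": "GIT"},
    }


# The same file, configured first as a notebook and then as a script.
as_notebook = notebook_task("resources/notebooks/examplename")
as_script = python_script_task("resources/notebooks/examplename.py")
```

Only one of the two task definitions should appear in the job for a given file, depending on whether the file carries the notebook identifier.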

If the cause relates to GitHub credentials or permissions

Verify that the GitHub credentials and permissions for the service principal or user running the job are correctly configured. For more information, review the Configure Git credentials & connect a remote repo to Databricks (AWS | Azure | GCP) documentation.

Also refer to the Manage file assets in Databricks Git folders (AWS | Azure | GCP) and Service principals for CI/CD (AWS | Azure | GCP) documentation.
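One way to start verifying credentials is to list the Git credentials stored for the identity that runs the job. The sketch below assumes the Databricks Git credentials REST endpoint (GET /api/2.0/git-credentials) and a response containing a credentials list whose entries have a git_provider field. The helper names, workspace URL, and token handling are hypothetical. The request must be authenticated as the same service principal or user that runs the job, and the sketch does not check repository permissions on the GitHub side.

```python
import urllib.request


def build_list_credentials_request(workspace_url, token):
    """Build (but do not send) a request that lists stored Git credentials.

    Hypothetical helper; the endpoint path is an assumption to verify.
    """
    url = workspace_url.rstrip("/") + "/api/2.0/git-credentials"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def github_credentials(payload):
    """Return the entries in the parsed response whose provider is GitHub."""
    return [
        entry
        for entry in payload.get("credentials", [])
        # Matches "gitHub" and "gitHubEnterprise" regardless of letter case.
        if entry.get("git_provider", "").lower().startswith("github")
    ]
```

To send the request, pass it to urllib.request.urlopen and parse the body with json.load. An empty result from github_credentials suggests that no GitHub credential is configured for that identity.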
