Vector search index does not sync or update, and gives error EXECUTION_SERVICE_STARTUP_FAILURE.STORAGE_PERMISSION_ISSUE

Update the Azure Storage firewall settings to allow Databricks serverless compute subnets.

Written by saikumar.divvela

Last published at: October 1st, 2024

Problem 

When you create a vector search index on your Delta tables, the index does not sync or update. The DLT pipeline for the index fails with the following error.

com.databricks.pipelines.common.CustomException: [DLT ERROR CODE: EXECUTION_SERVICE_STARTUP_FAILURE.STORAGE_PERMISSION_ISSUE]
Operation failed: "This request is not authorized to perform this operation."

Caused by: com.databricks.pipelines.common.CustomException: [DLT ERROR CODE: EXECUTION_SERVICE_STARTUP_FAILURE.STORAGE_PERMISSION_ISSUE] Operation failed: "This request is not authorized to perform this operation.", 403, GET, https://xxxxxxxxx.dfs.core.windows.net/honinstallbase?upn=false&resource=filesystem&maxResults=5000&directory=igs/__unitystorage/schemas/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/tables/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/_delta_log&continuation=xxxxxxxxxxx=&timeout=90&recursive=false&st=2024-07-21T13:04:55Z&sv=2020-02-10&ske=2024-07-21T15:04:55Z&sig=XXXXX&sktid=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx&se=2024-07-21T14:28:39Z&sdd=6&skoid=xxxxxxxx-xxxx-xxxxxxxxxxxxxxxxxxxxxx&spr=https&sks=b&skt=2024-07-21T13:04:55Z&sp=rl&skv=2020-10-02&sr=d, AuthorizationFailure, "This request is not authorized to perform this operation. RequestId:44ea689f-e01f-0076-7671-db2675000000 Time:2024-07-21T13:28:42.2309941Z"
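The failing URL in the stack trace identifies which storage account and container the pipeline could not reach. As a minimal sketch (the function name and the sample values below are illustrative assumptions, not part of any Databricks tooling), the following Python pulls those details out of the error text so you know which storage account's firewall to review.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Matches an ADLS Gen2 (dfs) endpoint URL inside the error text.
URL_PATTERN = re.compile(r'https://[\w-]+\.dfs\.core\.windows\.net/[^\s"]*')
# Matches the ", 403, GET, https://" fragment that precedes the URL.
STATUS_PATTERN = re.compile(r",\s*(\d{3}),\s*[A-Z]+,\s*https://")


def storage_target_from_error(message):
    """Return account, container, directory, and HTTP status, or None if no URL is found."""
    url_match = URL_PATTERN.search(message)
    if url_match is None:
        return None
    # The URL in the log is followed by a comma; drop it before parsing.
    url = urlsplit(url_match.group(0).rstrip(","))
    status_match = STATUS_PATTERN.search(message)
    query = parse_qs(url.query)
    return {
        "account": url.hostname.split(".")[0],
        "container": url.path.strip("/"),
        "directory": query.get("directory", [None])[0],
        "status": int(status_match.group(1)) if status_match else None,
    }


# Hypothetical, shortened error line with placeholder names.
sample = (
    'Operation failed: "This request is not authorized to perform this operation.", '
    "403, GET, https://examplestore.dfs.core.windows.net/mycontainer"
    "?upn=false&resource=filesystem&directory=a/b/_delta_log&sr=d, AuthorizationFailure"
)
print(storage_target_from_error(sample))
```

A 403 status together with AuthorizationFailure on a request for the _delta_log directory is consistent with the firewall cause described in this article, but the parser only reports what the log contains; it does not confirm the cause.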

Cause 

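When an Azure Storage firewall is enabled on your storage account, requests from Databricks serverless compute are blocked unless the serverless compute subnets are explicitly allowed. The managed DLT pipeline that syncs the index runs on serverless compute, so it cannot read the table's storage and fails with the 403 AuthorizationFailure shown in the error message.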

Solution

Update your storage account firewall settings to allow the Databricks serverless compute subnets, then retrigger the DLT pipeline.

For instructions, review the "Configure a firewall for serverless compute access" documentation.
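If you have several subnets to allow, scripting the firewall update can help. The sketch below only builds Azure CLI commands of the form az storage account network-rule add, one per subnet, and prints them for review. The function name and every value passed to it are placeholders and assumptions; take the real subnet resource IDs from the documentation referenced above.

```python
import shlex


def build_network_rule_commands(subscription, resource_group, account_name, subnet_ids):
    """Build one 'az storage account network-rule add' command per subnet ID."""
    commands = []
    for subnet_id in subnet_ids:
        commands.append([
            "az", "storage", "account", "network-rule", "add",
            "--subscription", subscription,
            "--resource-group", resource_group,
            "--account-name", account_name,
            # A full subnet resource ID, so no separate virtual network name is passed.
            "--subnet", subnet_id,
        ])
    return commands


# All values here are hypothetical placeholders.
commands = build_network_rule_commands(
    subscription="<subscription-id>",
    resource_group="<resource-group>",
    account_name="<storage-account>",
    subnet_ids=["<serverless-subnet-resource-id-1>", "<serverless-subnet-resource-id-2>"],
)
for command in commands:
    # Print instead of executing, so the commands can be checked first.
    print(shlex.join(command))
```

Printing rather than running the commands is a deliberate choice: a firewall change affects every client of the storage account, so it is worth reviewing each command before you run it.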

Note

Any time a DLT pipeline using serverless compute needs to access a storage account that has a firewall enabled, you must configure the firewall to allow the serverless compute subnets.