Problem
You have an Azure Databricks workspace whose storage account is restricted to private network access, and you are setting up Databricks-to-Databricks Delta Sharing with a recipient on another cloud platform. When the recipient queries the shared tables, the query fails with a FileReadException. The error details show an HTTP 403 response that indicates an authorization failure.
Example error message
FileReadException: Error while reading file delta-sharing:/XXXXXXXXXXXXXXXXXXcc.uc-deltasharing%253A%252F%252Fshare_test.sample_schema.sample_table%2523share_test.sample_schema.sample_table_XXXXXXXXXXXXXX/XXXXXXXXXXXXXX/XXXXXX. org.apache.spark.SparkIOException Caused by: SparkIOException: [HDFS_HTTP_ERROR.UNCATEGORIZED] When attempting to read from HDFS, HTTP request failed. HTTP request failed with status: HTTP/1.1 403 This request is not authorized to perform this operation. {"error":{"code":"AuthorizationFailure","message":"This request is not authorized to perform this operation.\nRequestId:XXXXXX\nTime:2024-09-23T13:17:14.8410995Z"}}, while accessing URI of shared table file SQLSTATE: KD00F Caused by: UnexpectedHttpStatus: HTTP request failed with status: HTTP/1.1 403 This request is not authorized to perform this operation. {"error":{"code":"AuthorizationFailure","message":"This request is not authorized to perform this operation.\nRequestId:XXXXXX\nTime:2024-09-23T13:17:14.8410995Z"}}, while accessing URI of shared table file
Cause
The 403 error and AuthorizationFailure message indicate that the recipient’s queries are being blocked by your storage account's firewall settings. This typically occurs when the egress IP of the recipient’s workspace has not been added to the allowlist on your Azure storage account's firewall.
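To check whether a failure matches this pattern, the recipient can pull the HTTP status and the storage error code out of the exception text. The following is a minimal sketch that assumes the message has the same shape as the example above. The function names are hypothetical, and a match is only consistent with a firewall block, not proof of one.

```python
import re
from typing import NamedTuple, Optional


class StorageError(NamedTuple):
    http_status: Optional[int]
    error_code: Optional[str]


def parse_storage_error(message: str) -> StorageError:
    """Extract the HTTP status and the JSON "code" field from an exception message."""
    # Matches the status line fragment, for example "HTTP/1.1 403".
    status = re.search(r"HTTP/\d(?:\.\d)?\s+(\d{3})", message)
    # Matches the first "code":"..." pair in the embedded JSON error body.
    code = re.search(r'"code"\s*:\s*"([^"]+)"', message)
    return StorageError(
        int(status.group(1)) if status else None,
        code.group(1) if code else None,
    )


def matches_firewall_block_signature(message: str) -> bool:
    """Return True when the message shows 403 together with AuthorizationFailure."""
    err = parse_storage_error(message)
    return err.http_status == 403 and err.error_code == "AuthorizationFailure"
```

Passing the text of the caught exception to `matches_firewall_block_signature` gives a quick first check before reviewing the storage account firewall rules.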
Solution
Classic compute and pro warehouse recipients
Add the egress IP of the recipient’s workspace to your Azure storage account firewall allowlist.
The fixed egress IP of the recipient workspace can be found by inspecting the NAT gateway or the static public IP address attached to the VNet of the recipient’s Azure Databricks workspace. If you are not routing traffic through the Internet and are only using Azure service endpoints, you can instead allow the recipient workspace's VNet to connect directly to the storage account.
- Log in to the Azure portal.
- Navigate to the storage account where the shared data resides.
- Click Networking under Security + networking.
- Click Firewalls and virtual networks.
- Under Firewall, add the egress IP to the storage account's allowlist.
- Save the changes.
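The same rule can also be added from the command line. The sketch below validates an egress IP and builds the matching Azure CLI command (`az storage account network-rule add`) without running it. The helper names, resource group, account name, and IP address are placeholders. It handles IPv4 only and rejects RFC 1918 private ranges and /31 ranges, which Azure storage IP rules do not accept. A /32 range is rewritten as a single address.

```python
import ipaddress
import shlex
from typing import List

# RFC 1918 private ranges, which storage account IP rules do not accept.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def to_storage_ip_rule(value: str) -> str:
    """Normalize an IPv4 address or CIDR range into a firewall rule value."""
    network = ipaddress.ip_network(value.strip(), strict=False)
    if network.version != 4:
        raise ValueError("this sketch handles IPv4 only")
    if any(network.overlaps(private) for private in RFC1918):
        raise ValueError(f"{value} is a private range, not a public egress IP")
    if network.prefixlen == 32:
        return str(network.network_address)  # single address, no /32 suffix
    if network.prefixlen == 31:
        raise ValueError("list both addresses of a /31 range individually")
    return str(network)


def build_allowlist_command(resource_group: str, account_name: str, egress_ip: str) -> List[str]:
    """Build (but do not run) the Azure CLI command that adds the IP rule."""
    return [
        "az", "storage", "account", "network-rule", "add",
        "--resource-group", resource_group,
        "--account-name", account_name,
        "--ip-address", to_storage_ip_rule(egress_ip),
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own before running the command.
    print(shlex.join(build_allowlist_command("my-rg", "mystorageacct", "203.0.113.10")))
```

Printing the command instead of executing it lets you review the rule before applying it with an authenticated Azure CLI session.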
Serverless recipients
If the recipient workspace uses serverless compute to query the shared data, its egress traffic originates from the serverless compute plane. To manage this traffic effectively, Databricks recommends creating and attaching a Network Connectivity Configuration (NCC) to the recipient workspace.
Once the NCC is attached, serverless compute egress traffic from the recipient’s workspace uses a set of stable IP addresses. The data provider can then add these IPs to the allowlist on the Azure storage account to ensure secure access to the share.
For detailed guidance on obtaining and configuring a list of stable IPs to allowlist, refer to the Configure a firewall for serverless compute access documentation.
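As a rough illustration, the stable IPs can be read from the NCC details once they have been retrieved as JSON. The sketch below assumes the recipient workspace is on AWS and that the NCC JSON exposes the ranges under `egress_config.default_rules.aws_stable_ip_rule.cidr_blocks`. Treat that field path as an assumption and verify it against the Databricks API reference. The sample payload is made up for illustration.

```python
import json
from typing import Any, Dict, List


def stable_cidrs_from_ncc(ncc: Dict[str, Any]) -> List[str]:
    """Return the stable egress CIDR blocks from an NCC JSON document.

    The field path used here is an assumption about the response shape.
    Missing or null fields yield an empty list instead of an error.
    """
    egress = ncc.get("egress_config") or {}
    rules = egress.get("default_rules") or {}
    stable = rules.get("aws_stable_ip_rule") or {}
    return list(stable.get("cidr_blocks") or [])


if __name__ == "__main__":
    # Hypothetical payload; real values come from the recipient's NCC.
    sample = json.loads("""
    {
      "name": "example-ncc",
      "egress_config": {
        "default_rules": {
          "aws_stable_ip_rule": {"cidr_blocks": ["198.51.100.0/28", "203.0.113.16/28"]}
        }
      }
    }
    """)
    for cidr in stable_cidrs_from_ncc(sample):
        print(cidr)
```

The recipient would share the resulting list with the data provider, who adds each range to the storage account firewall.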