Unable to access Azure Data Lake Storage (ADLS) Gen1 when firewall is enabled

Learn how to troubleshoot access issues when connecting to Azure Data Lake Storage Gen1 from Databricks with a firewall enabled.

Written by Adam Pavlacka

Last published at: December 9th, 2022

Problem

When a firewall is enabled on your Azure virtual network (VNet) and you try to access ADLS Gen1 using the ADLS Gen1 connector, the operation fails with the following error:

328 format(target_id, ".", name), value)
329 else:
330 raise Py4JError(
Py4JJavaError: An error occurred while calling o196.parquet.
: java.lang.RuntimeException: Could not find ADLS Token
at com.databricks.backend.daemon.data.client.adl.AdlCredentialContextTokenProvider$$anonfun$getToken$1.apply(AdlCredentialContextTokenProvider.scala:18)
at com.databricks.backend.daemon.data.client.adl.AdlCredentialContextTokenProvider$$anonfun$getToken$1.apply(AdlCredentialContextTokenProvider.scala:18)
at scala.Option.getOrElse(Option.scala:121)
at com.databricks.backend.daemon.data.client.adl.AdlCredentialContextTokenProvider.getToken(AdlCredentialContextTokenProvider.scala:18)
at com.microsoft.azure.datalake.store.ADLStoreClient.getAccessToken(ADLStoreClient.java:1036)
at com.microsoft.azure.datalake.store.HttpTransport.makeSingleCall(HttpTransport.java:177)
at com.microsoft.azure.datalake.store.HttpTransport.makeCall(HttpTransport.java:91)
at com.microsoft.azure.datalake.store.Core.getFileStatus(Core.java:655)
at com.microsoft.azure.datalake.store.ADLStoreClient.getDirectoryEntry(ADLStoreClient.java:735)
at com.microsoft.azure.datalake.store.ADLStoreClient.getDirectoryEntry(ADLStoreClient.java:718)
at com.databricks.adl.AdlFileSystem.getFileStatus(AdlFileSystem.java:423)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1426)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:94)

Cause

This is a known issue with the ADLS Gen1 connector. Connecting to ADLS Gen1 when a firewall is enabled is unsupported.
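Because the limitation applies only to the Gen1 connector, it can help to find which paths in a job still point at Gen1. The following is a minimal sketch, not part of the original article, that assumes Gen1 locations are written as adl:// URIs. The helper name is_adls_gen1_path and the example account names are hypothetical.

```python
from urllib.parse import urlparse

# ADLS Gen1 locations use the adl:// URI scheme.
GEN1_SCHEME = "adl"


def is_adls_gen1_path(path: str) -> bool:
    """Return True if the path uses the ADLS Gen1 (adl://) scheme.

    Hypothetical helper for illustration; mount points such as /mnt/...
    are not resolved, so a Gen1 mount would not be detected here.
    """
    return urlparse(path).scheme.lower() == GEN1_SCHEME


if __name__ == "__main__":
    # Example paths with placeholder account names.
    candidates = [
        "adl://examplestore.azuredatalakestore.net/data/events",
        "abfss://container@exampleaccount.dfs.core.windows.net/data/events",
    ]
    for candidate in candidates:
        print(candidate, is_adls_gen1_path(candidate))
```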

Solution

Use ADLS Gen2 instead.
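As an illustration of what switching to Gen2 can look like, the sketch below builds an abfss:// URI and the Hadoop ABFS settings for service principal (OAuth) authentication. It is one possible setup under stated assumptions, not the article's prescribed configuration: the helper names, the account, container, and secret scope values are all placeholders, and your workspace may use a different authentication method.

```python
def abfss_uri(container: str, account: str, path: str = "") -> str:
    """Build an ADLS Gen2 URI for the ABFS driver (hypothetical helper)."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def gen2_oauth_conf(account: str, client_id: str, client_secret: str, tenant_id: str) -> dict:
    """Return ABFS OAuth settings for a service principal (hypothetical helper).

    The keys are the standard Hadoop ABFS configuration names, scoped to
    one storage account.
    """
    host = f"{account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{host}": client_id,
        f"fs.azure.account.oauth2.client.secret.{host}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{host}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


# In a Databricks notebook, the settings could be applied like this
# (placeholder values; `spark`, `dbutils`, and `df` are assumed to exist):
#
#   secret = dbutils.secrets.get(scope="example-scope", key="example-key")
#   conf = gen2_oauth_conf("exampleaccount", "<client-id>", secret, "<tenant-id>")
#   for key, value in conf.items():
#       spark.conf.set(key, value)
#   df.write.parquet(abfss_uri("container", "exampleaccount", "data/events"))
```

Data already stored in Gen1 still has to be migrated to the Gen2 account before jobs can read it from the new location.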