Access files written by Apache Spark on ADLS Gen1
Problem You are using Azure Databricks and have a Spark job that is writing to ADLS Gen1 storage. When you try to manually read, write, or delete data in the folders, you get an error message: Forbidden. ACL verification failed. Either the resource does not exist or the user is not authorized to perform the requested operation Cause When writing data...
Custom Docker image requires root
Problem You are trying to launch a Databricks cluster with a custom Docker container, but cluster creation fails with an error. { "reason": { "code": "CONTAINER_LAUNCH_FAILURE", "type": "SERVICE_FAULT", "parameters": { "instance_id": "i-xxxxxxx", "databricks_error_message": "Failed to launch spark container on instance i-xxxx. Exception: Could not a...
Job fails with input/output error when displaying a data frame
Problem While working in a Data Engineering environment using Apache Spark SQL and Delta Lake, your job fails when attempting to display a data frame using the display() command. OSError: [Errno 5] Input/output error: '/Workspace/Repos/dir1/test'. Cause Network security group (NSG) configurations restrict certain ports required for internal commun...
Log delivery fails with AssumeRole
Problem You are using AssumeRole to send cluster logs to an S3 bucket in another account and you get an access denied error. Cause AssumeRole does not allow you to send cluster logs to an S3 bucket in another account. This is because the log daemon runs on the host machine. It does not run inside the container. Only items that run inside the container...
Libraries failing due to transient Maven issue
Problem Job fails because libraries cannot be installed. Library resolution failed. Cause: java.lang.RuntimeException: Cannot download some libraries due to transient Maven issue. Please try again later Cause After a Databricks upgrade, your cluster attempts to download any required libraries from Maven. After downloading, the libraries are stored a...
Remount a storage account after rotating access keys
Problem You have blob storage associated with a storage account mounted, but are unable to access it after access keys are rotated. Cause There are multiple mount points using the same storage account. Remounting some, but not all, of the mount points with new access keys results in access issues. Solution Use dbutils.fs.mounts() to check all mount ...
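As a rough illustration of the check described above, the sketch below groups mount points by storage account so that none is missed when remounting. It assumes each entry returned by dbutils.fs.mounts() exposes mountPoint and source attributes and that blob sources look like wasbs://<container>@<account>.blob.core.windows.net/...; the helper name mounts_for_account and the MountInfo stand-in are hypothetical and not part of the original article.

```python
from collections import namedtuple
from urllib.parse import urlparse

# Hypothetical stand-in for the entries returned by dbutils.fs.mounts(),
# so the helper can be exercised outside a Databricks notebook.
MountInfo = namedtuple("MountInfo", ["mountPoint", "source"])


def mounts_for_account(mounts, account_name):
    """Return the sorted mount points whose source URI targets account_name.

    Assumes sources of the form
    wasbs://<container>@<account>.blob.core.windows.net/<path>.
    """
    wanted = account_name.lower()
    matches = []
    for mount in mounts:
        # hostname drops the "<container>@" part and is lowercased;
        # non-URI sources such as "DatabricksRoot" yield None.
        host = urlparse(mount.source).hostname or ""
        if host.split(".")[0] == wanted:
            matches.append(mount.mountPoint)
    return sorted(matches)


# In a Databricks notebook (account name is a placeholder):
# mounts_for_account(dbutils.fs.mounts(), "<storage-account-name>")
```

Every mount point the helper returns shares the same storage account, so each one is a candidate for remounting with the new key.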
Permissions error when trying to run job clusters
Problem While attempting to run job clusters using a service principal, you receive the error: You cannot set the job's identity to <SP ID> because you do not have the required permissions. Please contact your workspace administrator or the user who manages the service principal. Additionally, you may see PERMISSION_DENIED: Please contact y...