Problem
You are using AWS Glue Data Catalog as a metastore, and a Databricks job fails with the error Could not reach driver of cluster <cluster-id>.
Important
Please note that AWS Glue Data Catalog as a metastore is no longer supported. Databricks recommends Unity Catalog instead. You can learn more about this change in the Databricks blog post, “Prepare Your Journey to Migrate from AWS Glue Data Catalog to Databricks Unity Catalog.”
Cause
The driver REPL (read-eval-print loop) crashes because the IAM role associated with the cluster is missing permissions needed to access the AWS Glue Data Catalog.
Solution
Verify the IAM role associated with the Databricks cluster. Ensure that the role has the necessary permissions to access the AWS Glue Data Catalog.
Update the IAM policy attached to the role to include the required permissions and restart the cluster to apply the changes. Then rerun the job to verify that the issue is resolved.
For more information, please review the Use AWS Glue Data Catalog as a metastore (legacy) documentation.
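The permission check described above can be sketched in Python. This is a minimal illustration, not an official tool: it assumes you have already exported the role's policy document as a Python dict (for example from the IAM console's JSON view), and the REQUIRED_ACTIONS set is a hypothetical subset chosen for the example. Take the authoritative list of Glue actions from the legacy documentation referenced above. The function name missing_actions is also invented for this sketch. It looks only at Allow statements and ignores Deny statements, conditions, and resource scoping, so treat an empty result as a hint rather than proof that access works.

```python
from fnmatch import fnmatchcase

# Hypothetical subset of Glue actions, used only for illustration.
# Replace it with the full list from the Databricks documentation.
REQUIRED_ACTIONS = {
    "glue:GetDatabase",
    "glue:GetDatabases",
    "glue:GetTable",
    "glue:GetTables",
    "glue:GetPartition",
    "glue:GetPartitions",
}


def _as_list(value):
    """IAM allows a single item or a list for Statement and Action."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def missing_actions(policy_document, required=REQUIRED_ACTIONS):
    """Return the required actions that no Allow statement covers.

    Wildcards such as "glue:*" or "glue:Get*" are honored. Deny
    statements, conditions, and resources are NOT evaluated.
    """
    allowed_patterns = []
    for statement in _as_list(policy_document.get("Statement")):
        if statement.get("Effect") == "Allow":
            # IAM action names are case-insensitive, so compare in lowercase.
            allowed_patterns.extend(
                pattern.lower() for pattern in _as_list(statement.get("Action"))
            )
    return sorted(
        action
        for action in required
        if not any(fnmatchcase(action.lower(), p) for p in allowed_patterns)
    )
```

Calling missing_actions(policy) on the exported policy returns a sorted list of the actions that still need to be added. After updating the policy, restart the cluster and rerun the job as described above.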
Note
As a general preventive measure, regularly review and update your IAM policies to ensure they grant all the permissions your workloads require. You can also monitor job logs for permission-related errors so that you can address them proactively.
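As one way to monitor logs for permission-related errors, the sketch below scans log lines for two phrases that AWS commonly uses when access is refused ("AccessDenied" and "not authorized to perform"). It assumes the log text is already available as a sequence of strings. The function name permission_errors and the patterns are illustrative choices, not part of any Databricks or AWS API, and real logs may word these errors differently.

```python
import re

# Phrases that commonly appear in AWS access-denied messages (an assumption;
# extend the pattern to match what your own logs contain).
PERMISSION_PATTERN = re.compile(
    r"AccessDenied|not authorized to perform", re.IGNORECASE
)
# Captures the action name when the message includes one,
# for example "... not authorized to perform: glue:GetDatabase ...".
ACTION_PATTERN = re.compile(
    r"not authorized to perform:\s*([A-Za-z0-9-]+:[A-Za-z0-9*]+)"
)


def permission_errors(lines):
    """Return (line_number, action_or_None) for each suspicious log line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if PERMISSION_PATTERN.search(line):
            match = ACTION_PATTERN.search(line)
            hits.append((number, match.group(1) if match else None))
    return hits
```

Each returned tuple gives the 1-based line number and, when the message names one, the action that was denied, which points to the permission to add to the IAM policy.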