Problem
Databricks jobs fail because the disk runs out of space, even though storage auto-scaling is enabled.
When you review the cluster event log, you see a message stating that the instance failed to expand disk due to an authorization error.
Instance i-xxxxxxxxx failed to expand disk because: You are not authorized to perform this operation. Encoded authorization failure message: xxxxx(Service: AmazonEC2; Status Code: 403; Error Code: UnauthorizedOperation; Request ID: xxxxxx). Current size: 98.30 GB Current free space: 8.89 GB.
Cause
The IAM role that Databricks uses in the workspace's underlying AWS account does not have the permissions required to create an EBS volume, attach it to the instance, and delete it after the cluster is terminated.
Solution
You must add the following permissions to the Databricks workspace deployment IAM role.
ec2:AttachVolume
ec2:CreateVolume
ec2:DeleteVolume
ec2:DescribeVolumes
You can find the Databricks workspace deployment IAM role by logging in to the Account Console and navigating to the AWS account tab.
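To check whether the role's policy already grants the four permissions listed above, you can compare the policy JSON against them. The following is a minimal sketch, not an official Databricks tool. It assumes you have copied the role's policy document out of the IAM console as JSON. The helper names (build_statement, missing_actions), the statement ID, and the "*" resource are hypothetical and chosen for illustration. The check only looks at Allow statements and does not evaluate Deny statements, NotAction, or conditions.

```python
import json
from fnmatch import fnmatchcase

# The four actions listed in this article.
REQUIRED_ACTIONS = [
    "ec2:AttachVolume",
    "ec2:CreateVolume",
    "ec2:DeleteVolume",
    "ec2:DescribeVolumes",
]


def build_statement(resource="*"):
    """Return an IAM policy statement that allows the required EBS actions.

    resource="*" is an assumption for illustration; scope it as your
    security requirements dictate.
    """
    return {
        "Sid": "DatabricksEbsAutoscaling",  # hypothetical statement ID
        "Effect": "Allow",
        "Action": list(REQUIRED_ACTIONS),
        "Resource": resource,
    }


def _as_list(value):
    # IAM JSON accepts either a single item or a list of items.
    return [value] if isinstance(value, (str, dict)) else list(value)


def missing_actions(policy_document):
    """Return the required actions that no Allow statement covers.

    Wildcards such as "ec2:*" are matched case-insensitively.
    """
    allowed = []
    for stmt in _as_list(policy_document.get("Statement", [])):
        if stmt.get("Effect") == "Allow":
            allowed.extend(a.lower() for a in _as_list(stmt.get("Action", [])))
    return [
        action
        for action in REQUIRED_ACTIONS
        if not any(fnmatchcase(action.lower(), pattern) for pattern in allowed)
    ]


if __name__ == "__main__":
    # Example policy that only allows describing volumes.
    existing = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["ec2:DescribeVolumes"], "Resource": "*"}
        ],
    }
    print("Missing:", missing_actions(existing))
    print(json.dumps(build_statement(), indent=2))
```

If missing_actions returns a non-empty list, add those actions to the role's policy, for example by appending the statement produced by build_statement.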