Job fails with input/output error when displaying a data frame

Check your NSG and port settings.

Written by dayanand.devarapalli

Last published at: September 12th, 2024

Problem

While working in a Data Engineering environment using Apache Spark SQL and Delta Lake, your job fails when you attempt to display a data frame using the display() command. The job returns the following error.

OSError: [Errno 5] Input/output error: '/Workspace/Repos/dir1/test'
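To confirm that a failure is this specific error rather than a missing path or a permissions problem, you can check the exception's errno value. The following is a minimal sketch, not an official diagnostic: the helper names is_eio and probe_path are hypothetical, and it assumes that a plain os.stat() call on the workspace path surfaces the same error that display() hits, which may not hold in every environment.

```python
import errno
import os


def is_eio(exc: BaseException) -> bool:
    """Return True if exc is an OSError carrying Errno 5 (EIO)."""
    return isinstance(exc, OSError) and exc.errno == errno.EIO


def probe_path(path: str) -> str:
    """Stat a path and classify the outcome (hypothetical helper)."""
    try:
        os.stat(path)
        return "ok"
    except OSError as exc:
        if is_eio(exc):
            # Same error class as the one reported in this article.
            return "io_error"
        # Missing path, permission problem, or another OS-level failure.
        return f"other_error: {exc.strerror}"


# Example call from a notebook cell, using the path from the error message:
# print(probe_path("/Workspace/Repos/dir1/test"))
```

A result of io_error points to the condition described here; any other result suggests a different cause.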

Cause

The network security group (NSG) for your workspace blocks a port that Databricks requires for internal communication. Specifically, port 1017 is not allowed in the NSG, so the display() command receives an input/output error when it attempts to access the filesystem.

Solution

  1. Ensure that all necessary ports for internal communications are allowed in your NSG for your Databricks workspace. 
  2. Specifically, verify that port 1017 is open for both UDP and TCP traffic.
  3. Refer to the Configure a customer-managed VPC (AWS | Azure | GCP) documentation section on security groups to configure your NSG to your needs.
  4. After updating the NSG settings, restart the affected job and verify that the issue is resolved.
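After you update the NSG, a quick TCP connection test can help confirm that the port is reachable. The following is a minimal sketch using only the Python standard library. The function name tcp_port_open is hypothetical, and the host in the example call is a placeholder: this article does not specify which endpoint to test, so substitute one that is appropriate for your network. The check covers TCP only. UDP is connectionless, so this approach cannot confirm that UDP traffic is allowed.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable or unknown hosts.
        return False


# Example call (placeholder host, an assumption for illustration only):
# print(tcp_port_open("<internal-endpoint>", 1017))
```

A False result does not prove that the NSG is the cause, because a host firewall or a service that is not listening produces the same outcome. Treat it as a prompt to recheck the rules, not as a definitive diagnosis.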

If your job is still failing, review the backend logs for any additional errors or timeouts and adjust the NSG settings accordingly.

Note

To prevent a recurrence, review your NSG configuration periodically and update it to accommodate changes in your Databricks environment or network requirements. Also consider setting up monitoring and alerts for network-related issues so that you can address potential disruptions proactively.