S3 connection reset error

Apache Spark job fails with S3 connection reset error.

Written by arjun.kaimaparambilrajan

Last published at: March 15th, 2022

Problem

Your Apache Spark job fails when attempting an S3 operation.

The error message Caused by: java.net.SocketException: Connection reset appears in the stack trace.

Example stack trace from an S3 read operation:

Caused by: javax.net.ssl.SSLException: Connection reset; Request ID: XXXXX, Extended Request ID: XXXXX, Cloud Provider: AWS, Instance ID: i-XXXXXXXX
    at sun.security.ssl.Alert.createSSLException(Alert.java:127)
    at sun.security.ssl.TransportContext.fatal(TransportContext.java:324)
    ...
    at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:833)
    at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
    ...
    at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:135)
    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:90)
    at com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180)
    at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:90)
    ...
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:210)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:467)
    at sun.security.ssl.SSLSocketInputRecord.readFully(SSLSocketInputRecord.java:450)
    at sun.security.ssl.SSLSocketInputRecord.decodeInputRecord(SSLSocketInputRecord.java:243)
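
For context, the read that produces a trace like this can be as simple as the following minimal sketch. The bucket and path are hypothetical placeholders, and spark is the SparkSession that Databricks notebooks provide.

# Minimal sketch of an S3 read that can hit a connection reset mid-transfer.
# "s3a://example-bucket/path/to/data/" is a hypothetical placeholder path.
df = spark.read.parquet("s3a://example-bucket/path/to/data/")
df.count()  # forces the read; the SocketException surfaces here as a job failure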

Cause

The older version of the Hadoop S3 connector does not retry on SocketTimeoutException or SSLException errors. These exceptions can occur when there is a client-side timeout or a server-side timeout, respectively.

Solution

This issue is resolved in a newer version of the Hadoop S3 connector, which retries on these exceptions. Databricks Runtime 7.3 LTS and above use the new connector.

  • If you are using Databricks Runtime 7.3 LTS or above, ensure that these settings DO NOT exist in the cluster’s Spark configuration (a verification sketch follows this list):
    spark.hadoop.fs.s3.impl com.databricks.s3a.S3AFileSystem
    spark.hadoop.fs.s3n.impl com.databricks.s3a.S3AFileSystem
    spark.hadoop.fs.s3a.impl com.databricks.s3a.S3AFileSystem
  • If you are using Databricks Runtime 7.0 - 7.2, upgrade to Databricks Runtime 7.3 LTS or above.
  • If you are using Databricks Runtime 6.4 or below, contact support for assistance.
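
To confirm that the legacy settings from the first bullet are not present, you can run a quick check from a notebook. This is a minimal sketch, assuming the notebook-provided spark session and the DATABRICKS_RUNTIME_VERSION environment variable that Databricks clusters expose; the keys checked are the same ones listed above.

import os

# Runtime version as exposed on Databricks clusters (assumed environment variable).
print("Databricks Runtime:", os.environ.get("DATABRICKS_RUNTIME_VERSION", "unknown"))

# The legacy override keys from the first bullet, as they appear in the cluster Spark config.
legacy_keys = [
    "spark.hadoop.fs.s3.impl",
    "spark.hadoop.fs.s3n.impl",
    "spark.hadoop.fs.s3a.impl",
]

conf = spark.sparkContext.getConf()
for key in legacy_keys:
    value = conf.get(key, None)
    if value == "com.databricks.s3a.S3AFileSystem":
        print(f"Remove legacy setting: {key} {value}")
    elif value is None:
        print(f"OK: {key} is not set")
    else:
        print(f"Check: {key} = {value}")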