Job timeout when connecting to a SQL endpoint over JDBC

Increase the SocketTimeout value in the JDBC connection URL to prevent requests from timing out while waiting for an available thread.

Written by Atanu.Sarkar

Last published at: January 20th, 2023

Problem

You have a job that reads from and writes to a SQL endpoint over a JDBC connection.

The SQL warehouse fails to execute the job, and you get a java.net.SocketTimeoutException: Read timed out error message.

2022/02/04 17:36:15 - TI_stg_trade.0 - Caused by: com.simba.spark.jdbc42.internal.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
2022/02/04 17:36:15 - TI_stg_trade.0 - at com.simba.spark.hivecommon.api.TETHttpClient.flushUsingHttpClient(Unknown Source)
2022/02/04 17:36:15 - TI_stg_trade.0 - at com.simba.spark.hivecommon.api.TETHttpClient.flush(Unknown Source)
2022/02/04 17:36:15 - TI_stg_trade.0 - at com.simba.spark.jdbc42.internal.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73)
2022/02/04 17:36:15 - TI_stg_trade.0 - at com.simba.spark.jdbc42.internal.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)

Cause

Each incoming request requires a thread for the duration of the request. When the number of simultaneous requests exceeds the number of available threads, requests must wait for a thread and can time out. This is most likely to happen during long-running queries.

Solution

Increase the SocketTimeout value in the JDBC connection URL.

In this example, the SocketTimeout is set to 300 seconds (the value is specified in seconds):

jdbc:spark://<server-hostname>:443;HttpPath=<http-path>;TransportMode=http;SSL=1;SocketTimeout=300[;property=value[;property=value]]
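As a minimal sketch, the URL with the increased SocketTimeout can be assembled programmatically before opening the connection. The host name and HTTP path below are placeholder values, not real endpoints, and the helper method name is an illustration rather than part of any driver API:

```java
// Sketch: build a Spark JDBC URL with an increased SocketTimeout.
// <server-hostname> and <http-path> placeholders are replaced here with
// example values for illustration only.
public class SocketTimeoutUrlExample {

    // Assemble the connection URL; socketTimeoutSeconds maps to the
    // SocketTimeout property (specified in seconds).
    static String buildUrl(String host, String httpPath, int socketTimeoutSeconds) {
        return String.format(
            "jdbc:spark://%s:443;HttpPath=%s;TransportMode=http;SSL=1;SocketTimeout=%d",
            host, httpPath, socketTimeoutSeconds);
    }

    public static void main(String[] args) {
        String url = buildUrl("example.cloud.databricks.com",
                              "/sql/1.0/warehouses/abc123", 300);
        System.out.println(url);
        // Pass this URL to DriverManager.getConnection(url, user, password)
        // with the Simba Spark JDBC driver on the classpath.
    }
}
```

Rebuilding the URL this way keeps the timeout in one place, so a long-running job can raise it without editing the full connection string by hand.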


For more information, review the Building the connection URL for the legacy Spark driver (AWS | Azure | GCP) documentation.