Apache Spark job fails with Failed to parse byte string
Problem
Spark-submit jobs fail with a Failed to parse byte string: -1 error message.
java.util.concurrent.ExecutionException: java.lang.NumberFormatException: Size must be specified as bytes (b), kibibytes (k), mebibytes (m), gibibytes (g), tebibytes (t), or pebibytes(p). E.g. 50b, 100k, or 250m.
Failed to parse byte string: -1
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:206)
at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:182)
... 108 more
Caused by: java.lang.NumberFormatException: Size must be specified as bytes (b), kibibytes (k), mebibytes (m), gibibytes (g), tebibytes (t), or pebibytes(p). E.g. 50b, 100k, or 250m.
Failed to parse byte string: -1
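This error typically indicates that a size property, such as spark.driver.maxResultSize, has been assigned a negative value. For example (a hypothetical submission; the property value is the point of interest):

```shell
# Invalid: -1 is not an accepted byte string, so the job fails at runtime
spark-submit --conf spark.driver.maxResultSize=-1 my_job.py
```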
Solution
The value assigned to the spark.driver.maxResultSize property defines the maximum size (in bytes) of the total serialized results for each Spark action. You can assign a positive value to define a specific limit, or a value of 0 for an unlimited size. You cannot assign a negative value to this property; doing so produces the Failed to parse byte string: -1 error shown above. If the total size of the results for a job exceeds the spark.driver.maxResultSize value, the job is aborted.
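For example, assuming a job submitted with spark-submit (the file name and size values below are illustrative), either of the following is valid:

```shell
# Cap total serialized results per action at 4 GiB
spark-submit --conf spark.driver.maxResultSize=4g my_job.py

# Or assign 0 for an unlimited size (use with caution)
spark-submit --conf spark.driver.maxResultSize=0 my_job.py
```

The value accepts the byte-string suffixes listed in the error message (b, k, m, g, t, p).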
You should be careful when setting an excessively high (or unlimited) value for spark.driver.maxResultSize. A high limit can cause out-of-memory errors in the driver if the spark.driver.memory property is not set high enough.
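If you do raise the result-size limit, consider raising the driver memory along with it, since the driver heap must hold the collected results. The values below are illustrative, not recommendations:

```shell
# Raise the result-size cap together with the driver heap that holds the results
spark-submit \
  --conf spark.driver.maxResultSize=8g \
  --conf spark.driver.memory=16g \
  my_job.py
```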
For more details, see the Application Properties section of the Spark Configuration documentation.