Apache Spark job fails with maxResultSize exception

Refactor your code, increase the driver maxResultSize, or enable Cloud Fetch.

Written by parth.sundarka

Last published at: July 23rd, 2025

Problem

An Apache Spark job fails with a maxResultSize exception. 

org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized
results of XXXX tasks (X.0 GB) is bigger than spark.driver.maxResultSize (X.0 GB)


Cause

This error occurs because the configured size limit was exceeded. The limit applies to the total serialized results of a Spark action across all partitions. Such actions include collect(), which returns all rows to the driver node, toPandas(), and saving a large file to the driver's local file system.
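
As a minimal sketch of the kind of code that triggers this error (the DataFrame here is a synthetic stand-in for your own data):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.range(100_000_000)  # stand-in for a large DataFrame

rows = df.collect()            # serializes every partition's rows back to the driver
pdf = df.toPandas()            # materializes the full DataFrame in driver memory
pdf.to_csv("/tmp/result.csv")  # saves the result on the driver's local file system

For each action, the total serialized size across all partitions is checked against spark.driver.maxResultSize.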


Solution

  • First, refactor your code to prevent the driver node from collecting a large amount of data. Change the code so that the driver node collects only a limited amount of data, or increase the driver instance memory size. (See the sketch below for example refactors.)
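
A hedged sketch of such a refactor, keeping large results out of the driver (the table name and output path are placeholders):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.table("<large_table>")  # placeholder source table

# Collect only the small slice the driver actually needs.
preview = df.limit(1000).collect()

# Keep large results distributed: write them to cloud storage instead of
# pulling them through the driver.
df.write.mode("overwrite").parquet("s3://<bucket>/results/")

# Or stream one partition at a time, so only a single partition's rows
# are held in driver memory at once.
for row in df.toLocalIterator():
    pass  # process each row incrementally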


  • If the workload requires you to collect a large amount of data and cannot be changed, update the spark.driver.maxResultSize property in the cluster Spark config (AWS | Azure | GCP) to a value higher than the <X> GB reported in the exception message.
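
For example, if the exception reports a 4.0 GB limit, the cluster Spark config entry could look like the following (8g is only an illustration; choose a value that fits within your driver's memory):

spark.driver.maxResultSize 8g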



Important

Setting spark.driver.maxResultSize to a very high value can lead to out-of-memory (OOM) errors on the driver. Use this configuration only if necessary.


  • If you are using BI tools (such as Power BI or Tableau) or your own custom application with a JDBC or ODBC driver and are facing this issue, enable Cloud Fetch in your application, as sketched after this list.

    • Cloud Fetch allows the executors to upload result sets to the workspace root cloud storage (Databricks File System). The driver node then generates pre-signed URLs for the result set and sends them to the JDBC or ODBC driver.

    • This workflow allows results to be downloaded directly from cloud storage, so the driver does not have to use its own memory to collect them.

    • For more information about Cloud Fetch, refer to the Driver capability settings for the Databricks ODBC Driver (AWS | Azure | GCP) documentation.
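
As a sketch of the same idea from a custom Python application, recent versions of the Databricks SQL Connector for Python expose a use_cloud_fetch flag on connect() (the hostname, HTTP path, token, and table name below are placeholders, and the flag's availability is an assumption about your connector version):

from databricks import sql

# Placeholders: fill in your workspace hostname, warehouse HTTP path,
# and a personal access token.
connection = sql.connect(
    server_hostname="<workspace-host>",
    http_path="<http-path>",
    access_token="<token>",
    use_cloud_fetch=True,  # assumed flag in recent connector versions
)

with connection.cursor() as cursor:
    cursor.execute("SELECT * FROM <large_table>")
    for row in cursor.fetchmany(10):
        print(row)

connection.close()

With Cloud Fetch enabled, large result sets are downloaded directly from cloud storage rather than streamed through the driver.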