Apache Spark job failing with SparkException: Job aborted due to stage failure error on dedicated compute

Execute queries involving fine-grained access control (FGAC) on standard compute.

Last published at: July 24th, 2025

Problem

When you try to run a job on dedicated compute, it fails with the following error.

SparkException: Job aborted due to stage failure: Total size of serialized results of 2817 tasks (4.0 GiB) is bigger than spark.driver.maxResultSize 4.0 GiB.

The job still fails after you increase spark.driver.maxResultSize and the driver memory to higher values.

Stacktrace

at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)

at com.databricks.sql.execution.arrowcollect.RDDBatchCollector.runSparkJobs(RDDBatchCollector.scala:261)

at com.databricks.sql.execution.arrowcollect.RDDBatchCollector.collect(RDDBatchCollector.scala:347)

at com.databricks.sql.execution.arrowcollect.CloudStoreCollector$.hybridCollect(CloudStoreCollector.scala:159)

at com.databricks.sql.execution.arrowcollect.CloudStoreCollector$.hybridCollect(CloudStoreCollector.scala:206)

at org.apache.spark.sql.execution.qrc.CompressedHybridCloudStoreFormat.collect(cachedSparkResults.scala:170)

at org.apache.spark.sql.execution.qrc.CompressedHybridCloudStoreFormat.collect(cachedSparkResults.scala:160)

at org.apache.spark.sql.connect.execution.SparkConnectPlanExecution.processAsRemoteBatches(SparkConnectPlanExecution.scala:475)

at org.apache.spark.sql.connect.execution.SparkConnectPlanExecution.handlePlan(SparkConnectPlanExecution.scala:141)

Cause

When Fine-Grained Access Control (FGAC) is enabled, queries involving restricted data, such as those protected by row-level security, column masking, or secure views, are offloaded to serverless compute for enforcement. The resulting data must then be fully materialized and transferred back to the dedicated cluster’s driver.

When the query spans a large number of small partitions, Apache Spark triggers an optimized execution path where executors send results directly to the driver, which aggregates and uploads them to cloud storage. If the total serialized result exceeds Spark’s internal 4 GiB driver-side limit, the job fails deterministically, regardless of driver memory or spark.driver.maxResultSize settings.
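To see why this is a "many small partitions" problem rather than a single oversized result, the figures from the error message can be turned into an average per-task result size. The following is a rough back-of-the-envelope sketch in Python. It assumes the 4.0 GiB reported in the message is close to the true total and that results are spread evenly across tasks, which may not hold for a real job.

```python
# Figures taken from the error message in this article.
GIB = 1024 ** 3
MIB = 1024 ** 2

num_tasks = 2817
# The message reports "4.0 GiB", so treat this as an approximation.
total_result_bytes = 4.0 * GIB

# Average serialized result size per task, assuming an even spread (an assumption).
avg_task_result_mib = total_result_bytes / num_tasks / MIB

print(f"Average serialized result per task: {avg_task_result_mib:.2f} MiB")
```

Under these assumptions each task returns only about 1.5 MiB, so no individual task is large. The failure comes from the combined total reaching the 4 GiB limit.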

For details, refer to the Fine-grained access control on dedicated compute (AWS | Azure | GCP) documentation.

Solution

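Execute queries involving FGAC on standard compute, where data filtering and access control enforcement are handled within the same compute environment. Because the query is not offloaded to serverless compute, the results do not need to be materialized and transferred back to a dedicated cluster's driver.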
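If the job runs on a job cluster defined through the API, switching to standard compute is typically a matter of changing the cluster's access mode. The following Python sketch builds a hypothetical cluster definition for illustration only. It assumes that the Clusters API field data_security_mode with the value USER_ISOLATION corresponds to standard (formerly shared) access mode and that SINGLE_USER corresponds to dedicated access mode. The runtime version, node type, and worker count are placeholders, not recommendations. Verify the field values against the API documentation for your workspace before using them.

```python
import json

# Hypothetical job cluster definition, for illustration only.
# "<runtime-version>" and "<node-type>" are placeholders to fill in for your workspace.
new_cluster = {
    "spark_version": "<runtime-version>",
    "node_type_id": "<node-type>",
    "num_workers": 2,
    # Assumption: USER_ISOLATION selects standard (formerly shared) access mode,
    # whereas SINGLE_USER selects dedicated access mode.
    "data_security_mode": "USER_ISOLATION",
}

# Print the definition as JSON so it can be reviewed before it is used anywhere.
print(json.dumps(new_cluster, indent=2))
```

The sketch only prints the definition. It does not call any Databricks API, so how the definition is submitted (UI, REST API, or another tool) is left open.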
