Delta table deletion operations intermittently return empty results when data exists

Update your cluster configuration to exclude the optimizer rule.

Written by aishwarya.ghosh

Last published at: October 7th, 2025

Problem

You notice your Delta table deletion operations intermittently return empty results on interactive clusters. The deletions execute successfully, but subsequent queries may show zero rows even when data exists in the table.

You may try restarting the cluster, which resolves the issue only temporarily. Manually clearing the cache does not consistently resolve it either.

Cause

Adaptive Query Execution (AQE) in Databricks includes the AQEPropagateEmptyRelation rule, which propagates information about empty datasets to optimize execution. In some Delta table scenarios, this rule can mistakenly flag non-empty tables as empty. 

When this happens, other optimizer rules, like OptimizeOneRowPlan and EliminateLimits, remove key operators such as Aggregate and LIMIT, resulting in incomplete or incorrect results. 

Solution

Update the cluster configuration to exclude the optimizer rule. 

  1. In your Databricks workspace, open the cluster settings for the interactive cluster experiencing the problem. 
  2. Add the following configuration to the cluster:
spark.databricks.optimizer.adaptive.excludedRules org.apache.spark.sql.execution.adaptive.AQEPropagateEmptyRelation
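If your cluster already excludes other optimizer rules, keep the setting as a single comma-separated list of fully qualified rule class names rather than overwriting the existing value (this assumes the property follows the same comma-separated format as Spark's `spark.sql.optimizer.excludedRules`). A minimal sketch of composing the value, where `already_excluded` is a hypothetical list of rules your cluster may already exclude:

```python
# Compose the value for spark.databricks.optimizer.adaptive.excludedRules.
# The value is assumed to be a comma-separated list of fully qualified rule
# class names, so append to any existing exclusions instead of replacing them.
already_excluded = []  # hypothetical: rules the cluster already excludes, if any

new_rule = "org.apache.spark.sql.execution.adaptive.AQEPropagateEmptyRelation"

excluded_rules = ",".join(already_excluded + [new_rule])
print(excluded_rules)
# → org.apache.spark.sql.execution.adaptive.AQEPropagateEmptyRelation
```

Paste the resulting string as the value of `spark.databricks.optimizer.adaptive.excludedRules` in the cluster's Spark config.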

For details on how to apply Apache Spark configs, refer to the “Spark configuration” section of the Compute configuration reference (AWS | Azure | GCP) documentation.

After saving the configuration, restart the cluster to apply the changes. 

This configuration excludes the AQEPropagateEmptyRelation rule from Adaptive Query Execution, preventing it from incorrectly treating non-empty Delta tables as empty and from removing critical operators such as Aggregate and LIMIT.

Important

Excluding AQEPropagateEmptyRelation changes how Spark handles empty relations during query optimization. When excluded, Spark processes all query stages even when inputs appear empty, which may result in slightly higher resource usage and longer query times for genuinely empty datasets. For most workloads, this performance difference is minimal and outweighed by the improved accuracy of Delta table query results.