Problem
You notice that, on interactive clusters, queries against a Delta table intermittently return empty results after deletion operations. The deletions execute successfully, but subsequent queries may return zero rows even though data exists in the table.
Restarting the cluster resolves the issue only temporarily, and manually clearing the cache does not resolve it consistently either.
Cause
Adaptive Query Execution (AQE) in Databricks includes the AQEPropagateEmptyRelation rule, which propagates information about empty datasets to optimize execution. In some Delta table scenarios, this rule can mistakenly flag non-empty tables as empty.
When this happens, other optimizer rules, like OptimizeOneRowPlan and EliminateLimits, remove key operators such as Aggregate and LIMIT, resulting in incomplete or incorrect results.
Solution
Update the cluster configuration to exclude the optimizer rule.
- In your Databricks workspace, open the cluster settings for the interactive cluster experiencing the problem.
- Add the following configuration to the cluster:
spark.databricks.optimizer.adaptive.excludedRules org.apache.spark.sql.execution.adaptive.AQEPropagateEmptyRelation
For details on how to apply Apache Spark configs, refer to the “Spark configuration” section of the Compute configuration reference (AWS | Azure | GCP) documentation.
After saving the configuration, restart the cluster to apply the changes.
This configuration ensures that the AQEPropagateEmptyRelation rule is excluded from Adaptive Query Execution, preventing it from incorrectly treating non-empty Delta tables as empty and avoiding the removal of critical operators like Aggregate and LIMIT.
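After the restart, you may want to confirm that the setting took effect. The following is a minimal sketch, not an official Databricks utility: the helper names (parse_rules, rule_is_excluded, with_rule_excluded) are hypothetical, and it assumes that multiple excluded rules are given as a comma-separated list. The key and rule name are copied from the configuration line above.

```python
# Key and rule name copied from the configuration line in this article.
CONF_KEY = "spark.databricks.optimizer.adaptive.excludedRules"
RULE = "org.apache.spark.sql.execution.adaptive.AQEPropagateEmptyRelation"


def parse_rules(conf_value):
    """Split an excludedRules value into individual rule names.

    Assumption: multiple rules are separated by commas.
    """
    if not conf_value:
        return []
    return [r.strip() for r in conf_value.split(",") if r.strip()]


def rule_is_excluded(conf_value, rule=RULE):
    """Return True if `rule` is listed in the excludedRules value."""
    return rule in parse_rules(conf_value)


def with_rule_excluded(conf_value, rule=RULE):
    """Return a value that keeps any existing rules and adds `rule` once."""
    rules = parse_rules(conf_value)
    if rule not in rules:
        rules.append(rule)
    return ",".join(rules)


# In a notebook attached to the restarted cluster, where `spark` is predefined:
#   current = spark.conf.get(CONF_KEY, "")
#   print(rule_is_excluded(current))
```

The with_rule_excluded helper is only relevant if the cluster already excludes other adaptive rules, in which case the existing value should be extended rather than overwritten.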
Important
Excluding AQEPropagateEmptyRelation changes how Spark handles empty relations during query optimization. When excluded, Spark processes all query stages even when inputs appear empty, which may result in slightly higher resource usage and longer query times for genuinely empty datasets. For most workloads, this performance difference is minimal and outweighed by the improved accuracy of Delta table query results.