Problem
Your Apache Spark structured streaming job fails with the following error.
com.databricks.sql.transaction.tahoe.DeltaRuntime Exception: [DELTA_MERGE_MATERIALIZE_SOURCE_FAILED_REPEATEDLY] Keeping the source of the MERGE statement materialized has failed repeatedly.
Cause
Delta Engine tries and fails to optimize a MERGE
statement by materializing the source data in memory. The source data are too large to fit in memory, or there are issues with the underlying storage system.
Solution
Disable source data materialization during MERGE
statement optimization. This way, Delta Engine still optimizes the MERGE
statement but relies on the regular execution plan to process the data.
spark.databricks.delta.merge.materializeSource None