Running OPTIMIZE on Delta tables causes a ConcurrentDeleteDeleteException error

Written by Vidhi Khaitan

Last published at: December 10th, 2024

Problem

When running OPTIMIZE on a Delta table using a job, you receive an error. 


[DELTA_CONCURRENT_DELETE_DELETE] ConcurrentDeleteDeleteException: This transaction attempted to delete one or more files that were deleted (for example 2w/part-00004-c745c895-7e13-4f70-9e06-7243bc6c3174.c000.snappy.parquet) by a concurrent update. Please try the operation again.


Cause

Two or more jobs are attempting to perform optimization operations on the same table at the same time. This can happen when AUTO OPTIMIZE is enabled and you run a manual OPTIMIZE on the same table.


Solution

First, review the conflicting commit in the error message. If it contains "auto": true, the conflicting OPTIMIZE was triggered by AUTO OPTIMIZE.
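
You can also confirm this from the table history. The following sketch (PySpark, assuming a hypothetical table named my_db.my_table) lists recent OPTIMIZE commits; commits written by auto compaction typically include "auto": "true" in their operationParameters.

# List recent OPTIMIZE commits and their parameters for the table.
# "my_db.my_table" is a placeholder; substitute your own table name.
history_df = spark.sql("DESCRIBE HISTORY my_db.my_table")
(history_df
    .select("version", "timestamp", "operation", "operationParameters")
    .filter("operation = 'OPTIMIZE'")
    .show(truncate=False))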


If you need to run OPTIMIZE manually, disable AUTO OPTIMIZE. Because auto compaction and optimized writes are always enabled for MERGE, UPDATE, and DELETE operations, override this behavior by adding the following two configurations.


spark.databricks.delta.optimizeWrite.enabled false
spark.databricks.delta.autoCompact.enabled false
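
For example, in the notebook or job that runs the manual OPTIMIZE, these can be set at the session level (a minimal sketch; the same keys can also be added to the cluster's Spark configuration).

# Disable optimized writes and auto compaction for the current Spark session
# before running the manual OPTIMIZE.
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "false")
spark.conf.set("spark.databricks.delta.autoCompact.enabled", "false")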


For more information, review the Configure Delta Lake to control data file size (AWS | Azure | GCP) documentation.


Important

Databricks generally recommends keeping AUTO OPTIMIZE enabled. Disabling it can lead to lower performance and higher costs.