Problem
When you run `OPTIMIZE` on a Delta table from a job, you receive the following error.
[DELTA_CONCURRENT_DELETE_DELETE] ConcurrentDeleteDeleteException: This transaction attempted to delete one or more files that were deleted (for example 2w/part-00004-c745c895-7e13-4f70-9e06-7243bc6c3174.c000.snappy.parquet) by a concurrent update. Please try the operation again.
Cause
Two or more jobs are attempting to perform optimization operations on the same table at the same time. There are two typical cases:

- An `AUTO OPTIMIZE` operation completes, and a subsequent manual `OPTIMIZE` conflicts with it.
- You run a manual `OPTIMIZE`, and a subsequent `AUTO OPTIMIZE` conflicts with it.
Solution
First, check the conflicting commit message to determine which case is occurring.

If the conflicting commit message says `"auto": true`, an `AUTO OPTIMIZE` job clashed with a manual `OPTIMIZE`.

If the conflicting commit message says `"auto": false`, the manual operation clashed with an earlier auto-triggered one.

To confirm, run `DESCRIBE HISTORY <table-name>` and look for consecutive `OPTIMIZE` operations, one with `auto=true` and the other with `auto=false`, around the timestamp in the error.
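As a minimal sketch of that check, the Delta Lake Python API can filter the table history for `OPTIMIZE` commits. The table name `my_db.my_table` is a placeholder, and the exact contents of `operationParameters` can vary by Databricks Runtime version.

```python
from delta.tables import DeltaTable
from pyspark.sql import functions as F

# Placeholder table name; replace with the table from the error message.
history_df = DeltaTable.forName(spark, "my_db.my_table").history()

# Keep only OPTIMIZE commits around the failure; operationParameters shows
# whether a commit was auto-triggered ("auto": "true") or manual ("auto": "false").
(history_df
    .filter(F.col("operation") == "OPTIMIZE")
    .select("version", "timestamp", "operation", "operationParameters")
    .orderBy(F.col("timestamp").desc())
    .show(truncate=False))
```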
The solution is the same for both cases. If you need to run `OPTIMIZE` manually, disable `AUTO OPTIMIZE`. Because auto compaction and optimized writes are always enabled for `MERGE`, `UPDATE`, and `DELETE` operations, override this behavior by adding the following two configurations.
spark.databricks.delta.optimizeWrite.enabled false
spark.databricks.delta.autoCompact.enabled false
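If you prefer to scope the change to a single job or notebook rather than the whole cluster, the same two settings can be applied at the session level; this narrower scoping is an assumption about your setup, and the cluster-level Spark config above is the documented approach.

```python
# Session-level equivalent of the cluster Spark config above, set in the job
# or notebook that performs the manual OPTIMIZE. This disables optimized
# writes and auto compaction for this session only.
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "false")
spark.conf.set("spark.databricks.delta.autoCompact.enabled", "false")
```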
For more information, review the Configure Delta Lake to control data file size (AWS | Azure | GCP) documentation.
Important
Databricks generally recommends keeping `AUTO OPTIMIZE` enabled. Disabling it can lead to lower performance and higher costs.