Problem
When running OPTIMIZE on a Delta table using a job, you receive an error similar to the following:

[DELTA_CONCURRENT_DELETE_DELETE] ConcurrentDeleteDeleteException: This transaction attempted to delete one or more files that were deleted (for example 2w/part-00004-c745c895-7e13-4f70-9e06-7243bc6c3174.c000.snappy.parquet) by a concurrent update. Please try the operation again.
Cause
Two or more jobs are attempting to perform optimization operations on the same table at the same time. This can happen when AUTO OPTIMIZE is enabled and you also run a manual OPTIMIZE on the same table.
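For illustration, the following is a minimal sketch of a setup that can produce this conflict. It assumes a hypothetical table named events that has AUTO OPTIMIZE enabled through its Delta table properties while a separate job runs a manual OPTIMIZE on the same table.

# Sketch only: hypothetical table "events" with AUTO OPTIMIZE enabled
# via its Delta table properties.
spark.sql("""
    ALTER TABLE events SET TBLPROPERTIES (
        'delta.autoOptimize.optimizeWrite' = 'true',
        'delta.autoOptimize.autoCompact'  = 'true'
    )
""")

# Meanwhile, a separate scheduled job compacts the same table manually.
# If auto compaction rewrites files at the same time, the manual OPTIMIZE
# can fail with ConcurrentDeleteDeleteException.
spark.sql("OPTIMIZE events")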
Solution
First, review the conflicting commit error message. If it contains "auto":true, then AUTO OPTIMIZE is enabled.
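You can also check the table properties directly. The following is a minimal sketch, assuming a hypothetical table named events:

# Sketch only: inspect the Delta table properties of a hypothetical
# table "events" to see whether the AUTO OPTIMIZE properties are set.
props = spark.sql("SHOW TBLPROPERTIES events").collect()
for row in props:
    if row.key in ("delta.autoOptimize.optimizeWrite",
                   "delta.autoOptimize.autoCompact"):
        print(row.key, "=", row.value)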
If you need to run OPTIMIZE manually, disable AUTO OPTIMIZE. Because auto compaction and optimized writes are always enabled for MERGE, UPDATE, and DELETE operations, override this behavior by adding the following two Spark configurations (a session-level example follows the configurations).
spark.databricks.delta.optimizeWrite.enabled false
spark.databricks.delta.autoCompact.enabled false
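For example, the configurations can be set at the session level in a notebook or job before running the manual OPTIMIZE. This is a minimal sketch, assuming a hypothetical table named events:

# Sketch only: disable optimized writes and auto compaction for this
# Spark session before running a manual OPTIMIZE.
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "false")
spark.conf.set("spark.databricks.delta.autoCompact.enabled", "false")

# Hypothetical table name; replace with your own table.
spark.sql("OPTIMIZE events")

The same two settings can also be applied in the cluster's Spark configuration so they take effect for all workloads on that cluster.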
For more information, review the Configure Delta Lake to control data file size (AWS | Azure | GCP) documentation.
Important
Databricks generally recommends keeping AUTO OPTIMIZE enabled. Disabling it can lead to lower performance and higher costs.