Problem
You upgraded from a Databricks Runtime version below 13.3 LTS to 13.3 LTS or above, and the following error now appears in the logs.
Caused by: com.databricks.common.filesystem.InconsistentReadException: The file might have been updated during query execution. Ensure that no pipeline updates existing files during query execution and try again.
Cause
Databricks Runtime 13.3 LTS and above include a consistency check that earlier versions do not have.
In earlier versions, such as Databricks Runtime 10.4 LTS, a query would still read a file that was updated between query planning and execution, which could lead to unpredictable results. Databricks Runtime 13.3 LTS and above instead return an error if a file is updated between these stages, to prevent inconsistent reads.
Solution
Apply the following configurations to disable file status caching. Disabling file status caching minimizes inconsistencies by reducing the duration files are kept in the cache.
- Set databricks.loki.fileStatusCache.enabled to false.
- Set spark.hadoop.databricks.loki.fileStatusCache.enabled to false.
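The two settings above can be written out as text for pasting into a cluster's Spark configuration. The following is a minimal sketch, assuming the settings are applied in the cluster's Spark config field, which takes one space-separated key-value pair per line. The helper name render_spark_config is hypothetical and only illustrates the format.

```python
# Configuration keys taken from the list above; both are set to "false".
FILE_STATUS_CACHE_CONFS = {
    "databricks.loki.fileStatusCache.enabled": "false",
    "spark.hadoop.databricks.loki.fileStatusCache.enabled": "false",
}


def render_spark_config(confs):
    """Render key/value pairs as one space-separated 'key value' pair per line.

    Hypothetical helper: assumes the target is the cluster-level Spark
    config text box, which expects this line format.
    """
    return "\n".join(f"{key} {value}" for key, value in confs.items())


if __name__ == "__main__":
    # Paste the printed lines into the cluster's Spark config, then restart the cluster.
    print(render_spark_config(FILE_STATUS_CACHE_CONFS))
```

Setting the values at the cluster level means they are in place before any query is planned.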
Note
Reducing the time files are kept in cache reduces the time between file status checks. It does not guarantee that the issue will be resolved during read and write operations.
If the issue persists, check whether another application is updating the file while you are trying to read it.
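Because the error message itself suggests trying again, a short retry wrapper may help while you track down the process that is updating the file. The following is a minimal sketch, not a Databricks API. The names is_inconsistent_read and run_with_retry are hypothetical, and the sketch assumes that the text of the exception raised in Python contains the Java class name InconsistentReadException.

```python
import time


def is_inconsistent_read(exc):
    """Return True if the error text mentions InconsistentReadException.

    Assumption: the Java exception class name appears in the string form
    of the exception that surfaces in Python.
    """
    return "InconsistentReadException" in str(exc)


def run_with_retry(action, max_attempts=3, wait_seconds=5.0, sleep=time.sleep):
    """Run action(), retrying only when an inconsistent-read error is raised."""
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception as exc:
            # Re-raise unrelated errors immediately, and give up after the last attempt.
            if not is_inconsistent_read(exc) or attempt == max_attempts:
                raise
            # Wait a little longer after each failed attempt.
            sleep(wait_seconds * attempt)


# Example usage in a notebook (spark and path are assumed to exist there):
# row_count = run_with_retry(lambda: spark.read.parquet(path).count())
```

A retry only hides the symptom. If the read keeps failing, the pipeline that updates the file during query execution still needs to be identified.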