Problem
Your streaming query fails when reading from a Delta table with a `com.databricks.sql.io.FileReadException: Error while reading file` error message.
Example error
com.databricks.sql.io.FileReadException: Error while reading file {file_path}
File {file_path} referenced in the transaction log cannot be found. This occurs when data has been manually deleted
from the file system rather than using the table `DELETE` statement. For more information,
see https://docs.databricks.com/delta/delta-intro.html#frequently-asked-questions
Cause
This error can occur when an Apache Spark task tries to read a source file that no longer exists. This happens when the streaming query takes longer than the time specified in `delta.deletedFileRetentionDuration` (default value of 7 days), so data files removed from the transaction log can be vacuumed away before the query finishes reading them.
It can also happen if the files were manually deleted from the file system.
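You can confirm what retention period the table currently uses by reading its table properties. The following is a minimal sketch, assuming a Databricks notebook (where `spark` is already defined) and a hypothetical table named `events`; substitute your own table name.

```python
# Minimal sketch: check whether delta.deletedFileRetentionDuration is set on
# the table. Assumes a Databricks notebook (SparkSession available as `spark`)
# and a hypothetical table name `events` -- replace with your own table.
rows = spark.sql("SHOW TBLPROPERTIES events").collect()
retention = {row["key"]: row["value"] for row in rows}.get(
    "delta.deletedFileRetentionDuration"
)

if retention is None:
    # The property is not set explicitly, so the default of 7 days applies.
    print("delta.deletedFileRetentionDuration not set; default is interval 7 days")
else:
    print(f"delta.deletedFileRetentionDuration = {retention}")
```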
Solution
Optimize your streaming query so it completes in less time, or increase the `delta.deletedFileRetentionDuration` value so it is at least one day longer than the time it takes your query to complete.
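The retention period can be raised with an `ALTER TABLE ... SET TBLPROPERTIES` statement. Below is a minimal sketch, again assuming a Databricks notebook and a hypothetical table named `events`; the 14-day interval is an example value chosen to leave more than a day of margin over a long-running query.

```python
# Minimal sketch: raise the retention period so data files removed from the
# transaction log remain on storage longer than the streaming query needs
# to finish. Assumes a Databricks notebook (SparkSession available as `spark`)
# and a hypothetical table name `events` -- replace the table and interval
# with your own values.
spark.sql("""
    ALTER TABLE events
    SET TBLPROPERTIES ('delta.deletedFileRetentionDuration' = 'interval 14 days')
""")
```

Keep in mind that a longer retention period keeps stale data files on storage until `VACUUM` removes them, so there is a storage cost trade-off.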
For more information on the `delta.deletedFileRetentionDuration` property, review the Work with Delta Lake table history (AWS | Azure | GCP) documentation.