Receiving com.databricks.sql.io.FileReadException: Error while reading file on streaming queries

Ensure the delta.deletedFileRetentionDuration value is longer than the time it takes your query to complete.

Written by raphael.balogo

Last published at: February 7th, 2025

Problem

Your streaming query reading from a Delta table fails with the error message com.databricks.sql.io.FileReadException: Error while reading file.


Example error

com.databricks.sql.io.FileReadException: Error while reading file {file_path} 
File {file_path} referenced in the transaction log cannot be found. This occurs when data has been manually deleted
from the file system rather than using the table `DELETE` statement. For more information,
see https://docs.databricks.com/delta/delta-intro.html#frequently-asked-questions


Cause

This error can occur when an Apache Spark task tries to read a source file that no longer exists. This typically happens when the streaming query takes longer to complete than the retention period specified in delta.deletedFileRetentionDuration (default value: 7 days), after which deleted files become eligible for permanent removal by VACUUM.

It can also happen if the file was manually deleted from the file system.
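To see which retention period currently applies to your table, you can inspect the table property. This is a sketch; my_table is a placeholder for your table name, and if the property is unset the 7-day default applies.

```sql
-- Show the current deleted-file retention setting for the table.
-- An empty result means the property is unset and the default (7 days) applies.
SHOW TBLPROPERTIES my_table ('delta.deletedFileRetentionDuration');
```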


Solution

Optimize your streaming query so it completes in less time, or increase the delta.deletedFileRetentionDuration value so it is at least one day longer than the time it takes your query to complete.
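As a sketch of the second option, you can raise the retention period with an ALTER TABLE statement. Here my_table and the 10-day interval are placeholders; choose an interval at least one day longer than your query's longest expected runtime.

```sql
-- Example: a streaming query that can take up to 9 days to complete,
-- so set the retention to 10 days (one day longer than the query runtime).
ALTER TABLE my_table
SET TBLPROPERTIES ('delta.deletedFileRetentionDuration' = 'interval 10 days');
```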


For more information on the delta.deletedFileRetentionDuration property, review the Work with Delta Lake table history (AWS | Azure | GCP) documentation.