Problem
When you attempt to run a streaming job against a Delta table, you encounter the following error.
[STREAM_FAILED] Query [id = <query-id>, runId = <run-id>] terminated with exception: [DIFFERENT_DELTA_TABLE_READ_BY_STREAMING_SOURCE] The streaming query was reading from an unexpected Delta table (id = '<id>').
It used to read from another Delta table (id = '<other-id>') according to checkpoint.
Cause
The streaming query's checkpoint stores the ID of the Delta table it last read from, and that ID differs from the ID of the table the query is now trying to read.
The difference in ID indicates the Delta table was recreated or replaced after the last successful run of the streaming query, resulting in a new Delta table ID.
For more information on Delta table IDs, refer to the Review table details with describe detail (AWS | Azure | GCP) documentation.
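To confirm the mismatch, you can compare the two IDs in the error message with the ID that DESCRIBE DETAIL reports for the current table. The following is a minimal sketch, not a definitive implementation. The helper names `extract_table_ids` and `current_table_id` are hypothetical, the regular expression assumes the message format shown in the Problem section, and `spark` is assumed to be an active SparkSession.

```python
import re

# Assumes the error text follows the format shown in the Problem section:
# "... unexpected Delta table (id = '<id>') ... another Delta table (id = '<other-id>') ..."
_ID_PATTERN = re.compile(r"Delta table \(id = '([^']+)'\)")


def extract_table_ids(error_message):
    """Return (current_id, checkpoint_id) parsed from the error text, or None."""
    ids = _ID_PATTERN.findall(error_message)
    if len(ids) < 2:
        return None
    return ids[0], ids[1]


def current_table_id(spark, table_name):
    """Look up the table's current ID with DESCRIBE DETAIL.

    `spark` is assumed to be an active SparkSession and `table_name`
    is a placeholder for your source table.
    """
    return spark.sql(f"DESCRIBE DETAIL {table_name}").select("id").first()["id"]
```

If the value returned by `current_table_id` matches the first ID in the error message and not the second, the table was most likely recreated or replaced after the checkpoint was written.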
Solution
- Update the streaming query configuration to point to a new checkpoint directory. This effectively treats the streaming query as a new stream, allowing it to start reading from the current Delta table.
- With the new checkpoint location, restart the streaming query.
- If the new source table contains data that was already processed from the old source table, handle the resulting duplicate records separately in the target table using deduplication logic.
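The first two steps above can be sketched as follows. This is an illustration under stated assumptions: the helper names `next_checkpoint_path` and `restart_stream` and the `_v2` suffix convention are hypothetical, `spark` is assumed to be an active SparkSession, and the table names are placeholders. Any unused directory works as the new checkpoint location.

```python
def next_checkpoint_path(old_path, suffix="_v2"):
    """Derive a fresh checkpoint directory next to the old one.

    The suffix convention is an assumption for illustration only.
    """
    return old_path.rstrip("/") + suffix


def restart_stream(spark, source_table, target_table, old_checkpoint):
    """Start the query as a new stream against the current source table."""
    new_checkpoint = next_checkpoint_path(old_checkpoint)
    return (
        spark.readStream.table(source_table)  # reads the current Delta table
        .writeStream
        .option("checkpointLocation", new_checkpoint)  # new, empty directory
        .toTable(target_table)  # returns a StreamingQuery
    )
```

Because the new checkpoint holds no offsets, the query reads the source table from the beginning unless you configure a starting point, which is why the duplicate handling in the last step may be needed.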
For more information, refer to the Structured Streaming checkpoints (AWS | Azure | GCP) documentation.
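For the duplicate-handling step, one possible approach is an insert-only MERGE run from `foreachBatch`, so that rows whose keys already exist in the target table are skipped. The sketch below assumes the target is a Delta table, that the records carry key columns that identify duplicates, and that `DataFrame.sparkSession` is available (PySpark 3.3 or later). The helper names and the `updates` view name are hypothetical.

```python
def build_insert_only_merge(target_table, source_view, key_columns):
    """Build a MERGE that inserts only rows whose keys are absent from the target."""
    condition = " AND ".join(f"t.{c} = s.{c}" for c in key_columns)
    return (
        f"MERGE INTO {target_table} t USING {source_view} s "
        f"ON {condition} WHEN NOT MATCHED THEN INSERT *"
    )


def make_upsert_batch(target_table, key_columns):
    """Return a function suitable for DataStreamWriter.foreachBatch."""
    def upsert_batch(batch_df, batch_id):
        # Remove duplicates inside the micro-batch before merging.
        deduped = batch_df.dropDuplicates(key_columns)
        deduped.createOrReplaceTempView("updates")
        # Use the batch's own session so the temp view is visible to the SQL.
        deduped.sparkSession.sql(
            build_insert_only_merge(target_table, "updates", key_columns)
        )
    return upsert_batch
```

The returned function would be passed to `writeStream.foreachBatch(...)` together with the new `checkpointLocation` option, followed by `start()`, in place of writing directly to the target table.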
Best practices
Where possible, avoid dropping and recreating, or replacing, Delta tables. Either operation changes the table ID and can break streaming queries whose checkpoints reference the previous ID.