Problem
Your Auto Loader streaming job fails with an UnknownFieldException error when a new column is added to the source files of the stream.
Exception: org.apache.spark.sql.catalyst.util.UnknownFieldException: Encountered unknown field(s) during parsing: <column name>
Cause
Auto Loader raises an UnknownFieldException error when it detects new columns in the incoming data that are not part of the schema it is currently using. The error stops the stream.
Solution
Set your Auto Loader stream to use schema evolution to avoid this issue.
For more information, review the How does Auto Loader schema evolution work? (AWS | Azure | GCP) documentation.
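The following is a minimal sketch of one way to configure schema evolution, not a definitive implementation. It assumes a JSON source and a Delta target. The helper names (auto_loader_options, start_stream) and all path arguments are hypothetical; the cloudFiles.* options and the mergeSchema writer option are the documented Auto Loader and Delta Lake settings.

```python
# Documented values for cloudFiles.schemaEvolutionMode.
ALLOWED_MODES = {"addNewColumns", "rescue", "failOnNewColumns", "none"}


def auto_loader_options(file_format, schema_location, evolution_mode="addNewColumns"):
    """Build the Auto Loader reader options (hypothetical helper)."""
    if evolution_mode not in ALLOWED_MODES:
        raise ValueError(f"Unsupported schema evolution mode: {evolution_mode}")
    return {
        "cloudFiles.format": file_format,
        # Auto Loader stores the inferred schema and its later versions here.
        "cloudFiles.schemaLocation": schema_location,
        "cloudFiles.schemaEvolutionMode": evolution_mode,
    }


def start_stream(spark, source_path, target_path, checkpoint_path, options):
    """Start an Auto Loader stream that writes to a Delta table path.

    `spark` is an existing SparkSession on a Databricks cluster; every path
    argument is a placeholder to replace with your own locations.
    """
    reader = spark.readStream.format("cloudFiles").options(**options)
    return (
        reader.load(source_path)
        .writeStream.format("delta")
        .option("checkpointLocation", checkpoint_path)
        # Let the Delta sink accept the columns added by schema evolution.
        .option("mergeSchema", "true")
        .start(target_path)
    )
```

With addNewColumns, the stream is still expected to stop once with an UnknownFieldException when a new column first appears. Auto Loader records the new column in the schema location before stopping, so a restart picks up the updated schema. Running the stream as a job with automatic retries handles that restart. If the stream must not stop at all, the rescue mode keeps the schema fixed and captures unexpected columns in the rescued data column instead.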