Problem
Apache Spark streaming jobs writing to Delta Lake may fail with an error indicating that the input schema contains nested fields that are capitalized differently than the target table.
[DELTA_NESTED_FIELDS_NEED_RENAME]
The input schema contains nested fields that are capitalized differently than the target table. They need to be renamed to avoid the loss of data in these fields while writing to Delta.
This error is distinct from Spark's general case-insensitive handling of data columns.
Note
This article applies to Databricks Runtime 14.3 and below.
Cause
While top-level fields in Delta Lake are case insensitive, nested fields must match the case exactly as defined in the table schema.
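The matching rule can be sketched in plain Python. This is a simplified illustration of the behavior described above, not Delta Lake's actual implementation: top-level column names resolve case-insensitively, while nested field names must match the target schema's casing exactly.

```python
# Simplified illustration (not Delta's actual resolution logic) of how
# input field names are matched against a target table schema.

def matches_target(input_name: str, target_name: str, nested: bool) -> bool:
    # Nested fields must match the target's case exactly.
    if nested:
        return input_name == target_name
    # Top-level columns are resolved case-insensitively.
    return input_name.lower() == target_name.lower()

# Top-level column: "UserID" resolves to the target column "userid".
print(matches_target("UserID", "userid", nested=False))  # True

# Nested field: "Zip" does not resolve to the target field "zip",
# which is the mismatch that raises DELTA_NESTED_FIELDS_NEED_RENAME.
print(matches_target("Zip", "zip", nested=True))  # False
```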
Solution
Set the following property in your Spark configuration. It automatically corrects the case of nested field names to match the target table's schema.
spark.conf.set("spark.databricks.delta.nestedFieldNormalizationPolicy", "cast")
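As a sketch of where this fits in a job, the property can be set before starting the streaming write. The source path, checkpoint location, and table name below are placeholders, not values from this article.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Normalize nested field names to the target table's casing on write.
spark.conf.set("spark.databricks.delta.nestedFieldNormalizationPolicy", "cast")

# Hypothetical streaming write; replace the paths and table name
# with your own source, checkpoint location, and target table.
(spark.readStream.format("delta")
    .load("/path/to/source")
    .writeStream.format("delta")
    .option("checkpointLocation", "/path/to/checkpoint")
    .toTable("target_table"))
```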
For more information, review the Error classes in Databricks (AWS | Azure) documentation.