InvalidSchemaException error when trying to insert data into a Delta table

Define a field type for any fields that use a StructType within a StructField.

Written by lucas.rocha

Last published at: January 30th, 2025


When inserting data into a Delta table whose schema contains a StructField with an empty StructType (a STRUCT with no fields), you encounter an InvalidSchemaException.


Example Error Message

Job aborted due to stage failure: Task 0 in stage 25.0 failed 4 times, most recent failure: Lost task 0.3 in stage 25.0 (TID 22) ( executor 0): org.apache.parquet.schema.InvalidSchemaException: Cannot write a schema with an empty group: optional group <field-name> {}



Empty STRUCT fields are not permitted in Parquet format. 

The issue arises when a StructField is defined with an empty StructType. In the following example, the col3 field is defined as a STRUCT with no fields. 


from pyspark.sql.types import StructType, StructField, FloatType

# col3 is a STRUCT with no fields, which Parquet cannot write.
schema = StructType([
    StructField("col1", FloatType(), nullable=True),
    StructField("col2", FloatType(), nullable=True),
    StructField("col3", StructType([]), nullable=True)
])
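To locate empty STRUCT fields before a write fails, a small recursive check along the following lines may help. This is a minimal sketch, not part of the original article: the helper name find_empty_structs is hypothetical, and it works on the dictionary form of a schema (the shape returned by schema.jsonValue() in PySpark) so that it can run without a Spark session.

```python
def find_empty_structs(dtype, path=""):
    """Return the dotted paths of every struct that has no fields.

    `dtype` is a schema in dictionary form, as produced by
    StructType.jsonValue(). This helper is a hypothetical illustration.
    """
    # Primitive types appear as plain strings such as "float" or "string".
    if not isinstance(dtype, dict):
        return []
    kind = dtype.get("type")
    if kind == "struct":
        fields = dtype.get("fields", [])
        if not fields:
            return [path or "<root>"]
        found = []
        for field in fields:
            child = f"{path}.{field['name']}" if path else field["name"]
            found.extend(find_empty_structs(field["type"], child))
        return found
    if kind == "array":
        return find_empty_structs(dtype["elementType"], f"{path}.element")
    if kind == "map":
        return (find_empty_structs(dtype["keyType"], f"{path}.key")
                + find_empty_structs(dtype["valueType"], f"{path}.value"))
    return []


# Dictionary form of the example schema, written out by hand here.
schema_json = {
    "type": "struct",
    "fields": [
        {"name": "col1", "type": "float", "nullable": True, "metadata": {}},
        {"name": "col2", "type": "float", "nullable": True, "metadata": {}},
        {"name": "col3", "type": {"type": "struct", "fields": []},
         "nullable": True, "metadata": {}},
    ],
}

empty_paths = find_empty_structs(schema_json)
print(empty_paths)  # ['col3']
```

With a real PySpark schema, the same check could be called as find_empty_structs(schema.jsonValue()). Any path it reports needs at least one nested field before the data can be written.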



Define at least one nested field for every StructType used within a StructField. In the following example, col3 contains a single nested field, nested_col, of type StringType.



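from pyspark.sql.types import StructType, StructField, FloatType, StringType
# col3 now declares one nested field, so the STRUCT is no longer empty.
schema = StructType([
    StructField("col1", FloatType(), nullable=True),
    StructField("col2", FloatType(), nullable=True),
    StructField("col3", StructType([
        StructField("nested_col", StringType())
    ]), nullable=True)
])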
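When a schema is generated programmatically and may contain several empty STRUCT fields, the same fix could be applied in bulk. The following is a rough sketch under stated assumptions rather than a recommended implementation: the helper names fill_empty_structs and _fill are hypothetical, the placeholder field name and its string type are arbitrary choices, and the code works on the dictionary form of a schema (the shape returned by schema.jsonValue()).

```python
import copy


def fill_empty_structs(dtype, placeholder="nested_col"):
    """Return a copy of the schema dict in which every empty struct
    holds one nullable string field. Hypothetical helper."""
    fixed = copy.deepcopy(dtype)  # leave the caller's schema untouched
    _fill(fixed, placeholder)
    return fixed


def _fill(dtype, placeholder):
    # Primitive types are plain strings and need no changes.
    if not isinstance(dtype, dict):
        return
    kind = dtype.get("type")
    if kind == "struct":
        if not dtype.get("fields"):
            # Assumption: a nullable string field is an acceptable stand-in.
            dtype["fields"] = [{"name": placeholder, "type": "string",
                                "nullable": True, "metadata": {}}]
            return
        for field in dtype["fields"]:
            _fill(field["type"], placeholder)
    elif kind == "array":
        _fill(dtype["elementType"], placeholder)
    elif kind == "map":
        _fill(dtype["keyType"], placeholder)
        _fill(dtype["valueType"], placeholder)


# Dictionary form of a schema with an empty col3, written out by hand.
broken = {
    "type": "struct",
    "fields": [
        {"name": "col1", "type": "float", "nullable": True, "metadata": {}},
        {"name": "col3", "type": {"type": "struct", "fields": []},
         "nullable": True, "metadata": {}},
    ],
}

fixed = fill_empty_structs(broken)
print(fixed["fields"][1]["type"]["fields"][0]["name"])  # nested_col
```

In PySpark, the result can be turned back into a schema with StructType.fromJson(fixed). Whether a placeholder field is appropriate depends on the data being written; defining the real nested fields by hand, as in the example above, is usually the clearer option.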


For more information, refer to the What is a view? (AWS | Azure | GCP) documentation.