Iceberg metadata not reflecting latest changes made to table, appearing out of sync

Add `MSCK REPAIR TABLE <table-name> SYNC METADATA` to the end of the ingestion job, or run it manually after each ingestion.

Written by sidhant.sahu

Last published at: April 28th, 2025

Problem

When using UniForm to share datasets with clients who do not have a Databricks instance, you notice your Iceberg metadata does not reflect the latest changes made to the Delta table.

 

You observe the issue when ingesting data into Iceberg tables using in-house pipelines: the Iceberg metadata lags one change behind the table.

 

Cause

When data is ingested into a Delta table, the changes are immediately reflected in the Delta log. However, the Iceberg metadata update is an asynchronous process that takes a minimum of a few hours after the ingestion job is completed to reflect changes. 

 

Your job terminates right after pushing the Delta changes, and the async Iceberg metadata update does not have enough time to complete. 

 

Solution

Add the `MSCK REPAIR TABLE <table-name> SYNC METADATA` command at the end of the ingestion job to ensure the Iceberg metadata updates synchronously with each ingestion, preventing delays in reflecting the latest changes.
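
As a minimal sketch of what that final job step could look like in a Python ingestion job, the helper below builds the statement and submits it through a Spark session. The function names, the table-name check, and the example table `main.sales.orders` are hypothetical and only for illustration. The sketch assumes the job already has a `spark` session object, as Databricks notebooks and jobs normally do.

```python
import re

# Assumption for this sketch: table names have one to three dot-separated
# parts made of letters, digits, and underscores. Adjust for your own naming.
_TABLE_NAME = re.compile(r"[A-Za-z0-9_]+(\.[A-Za-z0-9_]+){0,2}")


def build_sync_statement(table_name: str) -> str:
    """Return the MSCK REPAIR TABLE ... SYNC METADATA statement for a table."""
    if not _TABLE_NAME.fullmatch(table_name):
        raise ValueError(f"Unexpected table name: {table_name!r}")
    return f"MSCK REPAIR TABLE {table_name} SYNC METADATA"


def sync_iceberg_metadata(spark, table_name: str) -> str:
    """Submit the sync statement through the given Spark session.

    `spark` is any object with a `sql(statement)` method, such as the
    SparkSession that is predefined in Databricks notebooks and jobs.
    """
    statement = build_sync_statement(table_name)
    spark.sql(statement)
    return statement


# Hypothetical usage as the last step of an ingestion job:
# sync_iceberg_metadata(spark, "main.sales.orders")
```

Calling the helper as the last statement of the job, after the Delta write has committed, keeps the sync inside the job run instead of relying on the asynchronous update.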

 

Alternatively, you can manually run the `MSCK REPAIR TABLE <table-name> SYNC METADATA` command after each ingestion.

 

For details, refer to the “Manually trigger Iceberg metadata conversion” section of the Read Delta tables with Iceberg clients (AWS | Azure | GCP) documentation.