Parquet table last modification retrieval returns NULL

List the files within the Parquet table path and sort them by the modificationTime column.

Written by Shyamprasad Miryala

Last published at: February 7th, 2025

Problem

When you use the DESCRIBE DETAIL command to retrieve the last modification date of a Parquet table, it returns NULL.

 

Cause

Parquet tables do not have the metadata, such as the _delta_log directory, that Delta tables have and that DESCRIBE DETAIL uses as its source.

 

Solution

To get a last modification date, query an existing audit column, add a timestamp column, or list the table's files and sort them by modification time.

  • If your data contains audit columns like lastModified or lastUpdated, you can use a query to apply the max() function on that column to find the last modified date.
  • Alternatively, you can add an additional column to your Parquet table that populates using current_timestamp() while inserting or updating records. Then, you can apply a query with the max() function to get the value.
  • List the files within the Parquet table path and sort them by the modificationTime column to find when the last file was added or modified.
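
The file-listing option can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes each listing entry exposes `path`, `name`, and `modificationTime` (milliseconds since the epoch), as the entries returned by `dbutils.fs.ls()` do in a Databricks notebook. The `FileInfo` tuple, the `last_modified` helper, and the `dbfs:/tmp/example_table` path are hypothetical names used only for this example.

```python
from collections import namedtuple
from datetime import datetime, timezone

# Hypothetical stand-in for the entries returned by dbutils.fs.ls(),
# so that the sketch can run outside a Databricks notebook.
FileInfo = namedtuple("FileInfo", ["path", "name", "size", "modificationTime"])


def last_modified(files):
    """Return (path, UTC datetime) of the newest Parquet data file, or None.

    Assumes modificationTime is in milliseconds since the epoch.
    """
    # Keep data files only; skip markers such as _SUCCESS.
    data_files = [f for f in files if f.name.endswith(".parquet")]
    if not data_files:
        return None
    newest = max(data_files, key=lambda f: f.modificationTime)
    modified_at = datetime.fromtimestamp(
        newest.modificationTime / 1000, tz=timezone.utc
    )
    return newest.path, modified_at


# Made-up sample listing. In a notebook you would pass the real listing
# instead, for example: last_modified(dbutils.fs.ls("dbfs:/tmp/example_table"))
sample = [
    FileInfo("dbfs:/tmp/example_table/part-00000.parquet", "part-00000.parquet", 1024, 1738800000000),
    FileInfo("dbfs:/tmp/example_table/part-00001.parquet", "part-00001.parquet", 2048, 1738886400000),
    FileInfo("dbfs:/tmp/example_table/_SUCCESS", "_SUCCESS", 0, 1738886500000),
]
print(last_modified(sample))
```

Note that a single directory listing is not recursive, so for a partitioned table this sketch would have to be applied to each partition directory and the results compared.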