Time travel SELECT query works on older dates even after VACUUM

This is expected behavior but you can also test that your VACUUM command ran successfully.

Written by Shyamprasad Miryala

Last published at: February 7th, 2025

Problem

After you run VACUUM successfully, you notice you can still access and query data further back than the default seven days of retained history with time travel. 

 

Cause

When you run VACUUM,  it removes stale files from the file system but it may not remove all files immediately. When there are still active files at a particular version, time travel can still access those files even though they are beyond the expected retention period.

 

Solution

This is expected VACUUM behavior. To be sure VACUUM did execute when you ran the command, you can test by reading a table’s state as of "<older-date>" and writing it to a temporary table using the following query. 

 

INSERT OVERWRITE <your-temp-table> AS SELECT * FROM <catalog>.<schema>.<table-name> TIMESTAMP AS OF '<older-date>'

 

The query should fail, because it tries to read every Parquet data file at that state, and if some files are already removed by VACUUM, it can’t read those files.

Additionally, you can check the VACUUM history to see if it has indeed deleted the stale files from storage as expected.

For more information, review the Work with Delta Lake table history (AWSAzureGCP) and VACUUM (AWSAzureGCP) documentation.