Intermittent long-running OPTIMIZE command for liquid clustered table

Use Databricks Runtime 16.2 or above to run the OPTIMIZE command.

Written by manikandan.ganesan

Last published at: April 29th, 2025

Problem

When you run the OPTIMIZE command on a table with liquid clustering, the command sometimes takes several hours instead of the usual minute or so.

 

You check the table history for the operation metrics using the DESCRIBE HISTORY command and notice that the long-running OPTIMIZE queries compact more files than you usually see. 
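The comparison above can be sketched in Python. This is a minimal illustration, not a Databricks utility: the helper name `flag_heavy_optimize_runs` and the `factor` threshold are hypothetical, and it assumes each history row exposes `operation`, `version`, and an `operationMetrics` map containing `numRemovedFiles`, with metric values stored as strings.

```python
from statistics import median


def flag_heavy_optimize_runs(history_rows, factor=10.0):
    """Return (version, removed_files) for OPTIMIZE runs that removed far
    more files than the median OPTIMIZE run in the same history.

    history_rows: iterable of dicts shaped like DESCRIBE HISTORY rows.
    factor: hypothetical threshold; a run is flagged when it removed more
            than factor times the median number of files.
    """
    runs = []
    for row in history_rows:
        if row.get("operation") != "OPTIMIZE":
            continue
        metrics = row.get("operationMetrics") or {}
        # Assumption: metric values are strings, so convert before comparing.
        removed = int(metrics.get("numRemovedFiles", 0))
        runs.append((row.get("version"), removed))

    if not runs:
        return []

    typical = median(removed for _, removed in runs)
    # Guard against a median of zero so the threshold stays meaningful.
    threshold = factor * max(typical, 1)
    return [(version, removed) for version, removed in runs if removed > threshold]


# In a notebook, the rows could be collected like this (table name is a placeholder):
#   rows = [r.asDict() for r in spark.sql("DESCRIBE HISTORY my_table").collect()]
#   print(flag_heavy_optimize_runs(rows))
```

Comparing against the median rather than the mean keeps a single very large run from hiding itself by inflating the baseline.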

 

Cause

Liquid clustering occasionally needs to rebalance its internal metadata to ensure that it closely matches the table state and can be easily updated for new insertions to the table. 

 

In Databricks Runtime versions 16.1 and below, rebalancing requires collecting samples from the entire table, and liquid clustering may have to rewrite files after a metadata update if the new metadata differs from the previous state. 

 

After a table has been rebalanced, the issue doesn’t recur right away. If you notice in the Apache Spark UI that the OPTIMIZE query scans all the files in a table, the table is likely undergoing metadata rebalancing. 

 

Solution

Use Databricks Runtime 16.2 or above to run the OPTIMIZE command on your table with liquid clustering.
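
If OPTIMIZE runs from a scheduled job, a guard like the following sketch could catch a cluster that is still on an older runtime before the command starts. It rests on assumptions: that the runtime version is available in the `DATABRICKS_RUNTIME_VERSION` environment variable as a string beginning with `major.minor`, and that a `spark` session is passed in by the caller. The helper names are hypothetical.

```python
import os
import re

# Minimum runtime named in this article for OPTIMIZE on liquid clustered tables.
MIN_RUNTIME = (16, 2)


def parse_runtime_version(raw):
    """Extract (major, minor) from a string such as '16.2'; None if absent."""
    match = re.match(r"(\d+)\.(\d+)", raw or "")
    if not match:
        return None
    return int(match.group(1)), int(match.group(2))


def runtime_supports_fix(raw):
    """True when the parsed version is at least MIN_RUNTIME."""
    version = parse_runtime_version(raw)
    return version is not None and version >= MIN_RUNTIME


def optimize_if_supported(spark, table_name):
    """Run OPTIMIZE only when the cluster runtime is 16.2 or above.

    Assumption: DATABRICKS_RUNTIME_VERSION is set on the cluster.
    table_name must come from trusted input, since it is interpolated into SQL.
    """
    raw = os.environ.get("DATABRICKS_RUNTIME_VERSION", "")
    if not runtime_supports_fix(raw):
        raise RuntimeError(
            f"Runtime {raw or 'unknown'} is below {MIN_RUNTIME[0]}.{MIN_RUNTIME[1]}; "
            "OPTIMIZE may trigger a long metadata rebalance."
        )
    return spark.sql(f"OPTIMIZE {table_name}")
```

Comparing versions as integer tuples avoids the string-comparison trap in which "16.10" would sort before "16.2".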