Problem
You’re attempting to run an SQL command to enable liquid clustering on your existing Delta table.
ALTER TABLE <table_name> CLUSTER BY (<clustering_columns>)
The command results in an error.
[DELTA_CLUSTERING_COLUMN_MISSING_STATS] Liquid clustering requires clustering columns to have stats. Couldn't find clustering column(s) <clustering_columns>
Cause
The table is missing Delta statistics for the specific set of columns <clustering_columns>
used in the CLUSTER BY
clause.
Solution
To generate Delta statistics, run the following ANALYZE
command. This command generates Delta statistics on the first 32 columns defined in your table schema.
ANALYZE TABLE <table_name> COMPUTE DELTA STATISTICS
To select specific columns (column1,column2,column3
) for Delta statistics generation, and to reduce the ANALYZE
execution time, set the following configuration before running ANALYZE
. This set can also be the same as the clustering keys.
ALTER TABLE <table_name>
SET TBLPROPERTIES (
'delta.dataSkippingStatsColumns' = 'column1,column2,column3'
);