Problem
You try to run DESCRIBE HISTORY
in a SQL warehouse but get a max size limit error.
Example query
%sql
SELECT * from (DESCRIBE HISTORY TABLE_NAME)
Example error
Error Message: Serialized task 8992:0 was 274234284 bytes, which exceeds max allowed: spark.rpc.message.maxSize (268435456 bytes). Consider increasing spark.rpc.message.maxSize or using broadcast variables for large values.
Cause
DESCRIBE HISTORY
uses a single task to process output.
When using DESCRIBE HISTORY
or equivalent commands to process extensive history logs you may encounter limitations.
Solution
If you need to process large history logs with DESCRIBE HISTORY
, you can either limit the size of the query or prune using VERSION
.
Limit query size
You can limit the size of the query to a specific number of rows. For this example, we are limiting it to 1,000 rows. You should modify this based on your use case and available memory.
%sql
SELECT * from (DESCRIBE HISTORY TABLE_NAME LIMIT 1000)
Prune the query
You can use a filter with VERSION
to prune the query and limit it to just the data you need to process.
%sql
SELECT * FROM (DESC HISTORY TABLE_NAME) WHERE version > startVersion AND version < endVersion
Info
Pruning is not supported with TIMESTAMP
. If you try to combine the two, the query fails with the same spark.rpc.message.maxSize
error.