spark.rpc.message.maxSize error after running DESCRIBE HISTORY in SQL warehouse

You must limit or prune the query when using DESCRIBE HISTORY on large history logs.

Written by MuthuLakshmi.AN

Last published at: December 6th, 2024

Problem

You try to run DESCRIBE HISTORY in a SQL warehouse but get a max size limit error.

Example query

%sql

SELECT * from (DESCRIBE HISTORY TABLE_NAME)

Example error

Error Message: Serialized task 8992:0 was 274234284 bytes, which exceeds max allowed: spark.rpc.message.maxSize (268435456 bytes). Consider increasing spark.rpc.message.maxSize or using broadcast variables for large values.

Cause

DESCRIBE HISTORY uses a single task to process output.

When using DESCRIBE HISTORY or equivalent commands to process extensive history logs you may encounter limitations.

Solution

If you need to process large history logs with DESCRIBE HISTORY, you can either limit the size of the query or prune using VERSION.

Limit query size

You can limit the size of the query to a specific number of rows. For this example, we are limiting it to 1,000 rows. You should modify this based on your use case and available memory.

%sql

SELECT * from (DESCRIBE HISTORY TABLE_NAME LIMIT 1000)

Prune the query

You can use a filter with VERSION to prune the query and limit it to just the data you need to process.

%sql

SELECT * FROM (DESC HISTORY TABLE_NAME) WHERE version > startVersion AND version < endVersion

Info

Pruning is not supported with TIMESTAMP. If you try to combine the two, the query fails with the same spark.rpc.message.maxSize error.