RESOURCE_LIMIT_EXCEEDED error when querying a Delta Sharing table

Delta Sharing has limits on the metadata size of a shared table. If you exceed these limits it generates an error.

Written by Rajeev kannan Thangaiah

Last published at: April 19th, 2023

Problem

You are querying a Delta table shared in Delta Sharing and you get a RESOURCE_LIMIT_EXCEEDED error.

The error may reference a timeout.

io.delta.sharing.spark.util.UnexpectedHttpStatus: HTTP request failed with status: HTTP/1.1 400 Bad Request {"errorCode":"RESOURCE_LIMIT_EXCEEDED","message":"A timeout occurred when processing the table. If it continues to happen, please contact your data provider to share a smaller table instead."}.

The error may also reference table metadata that exceeds size limits.

{ 'errorCode': 'RESOURCE_LIMIT_EXCEEDED',
'message': 'The table metadata size exceeded limits'}

Cause

Delta Sharing has limits on the metadata size of a shared table.

  • You are limited to 700k AddFiles actions in the DeltaLog. This is how many active files you can have in a shared Delta table.
  • You are limited to 100k RemoveFiles actions in the DeltaLog. This is the number of files that have been deleted. This includes files that have been removed by operations like OPTIMIZE and MERGE.

Solution

You can run OPTIMIZE (AWS | Azure | GCP) on the shared Delta table to reduce the number of active files.

OPTIMIZE table_name [WHERE predicate]
  [ZORDER BY (col_name1 [, ...] ) ]


After running OPTIMIZE to reduce the number active files, you may hit the RemoveFiles limit if the OPTIMIZE command removed more than 100K files.

If this is the case, you can temporarily lower the delta.logRetentionDuration property. This lowers the length of time items stay in the DeltaLog. By setting it to a short retention time, for example 24 hours, the transaction log is quickly purged, which helps you stay under the 100K RemoveFiles limit.

%sql

ALTER TABLE <table-name>
SET TBLPROPERTIES ('delta.logRetentionDuration'='24 hrs')
Delete

Warning

Lowering the delta.logRetentionDuration property also reduces your ability to time travel. You can only time travel if the metadata is contained in the DeltaLog. If the log retention is set to 24 hours, you can only time travel back 24 hours.


Once the issue is resolved, you should revert the delta.logRetentionDuration property back to 30 days, so you can continue to use the time travel feature.

%sql

ALTER TABLE <table-name>
SET TBLPROPERTIES ('delta.logRetentionDuration'='30 days')


To prevent the issue from reoccurring, you should run OPTIMIZE periodically. This helps keep the number of active files below the limit.