Job fails due to cluster manager core instance request limit

Learn how to troubleshoot Databricks errors related to API rate limits.

Written by Adam Pavlacka

Last published at: March 4th, 2022

Problem

A Databricks notebook or a Jobs API call returns the following error:

Unexpected failure while creating the cluster for the job. Cause REQUEST_LIMIT_EXCEEDED: Your request was rejected due to API rate limit. Please retry your request later, or choose a larger node type instead.

Cause

The error indicates the Cluster Manager Service core instance request limit was exceeded.

A Cluster Manager core instance can support a maximum of 1000 requests.

Solution

Contact Databricks Support to increase the limit set in the core instance.

Databricks can increase the job limits maxBurstyUpsizePerOrg (up to 2000) and upsizeTokenRefillRatePerMin (up to 120). Currently running jobs are affected when the limits are increased.

Increasing these values can stop the throttling issue, but can also cause high CPU utilization.
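The names of the two limits suggest a token-bucket style throttle: a burst capacity that is drawn down by requests and topped up at a fixed rate per minute. That reading is an assumption based only on the parameter names, not on documented Cluster Manager internals. Under that assumption, a minimal sketch of how the two values would interact looks like this (the class and method names are hypothetical):

```python
class TokenBucket:
    """Illustrative token bucket; not the Cluster Manager implementation."""

    def __init__(self, capacity, refill_per_min):
        # capacity plays the role of maxBurstyUpsizePerOrg (assumed meaning)
        self.capacity = capacity
        # refill_per_min plays the role of upsizeTokenRefillRatePerMin (assumed meaning)
        self.refill_per_min = refill_per_min
        self.tokens = float(capacity)  # start with a full bucket

    def advance(self, minutes):
        """Simulate the passage of time: add tokens, capped at capacity."""
        self.tokens = min(self.capacity, self.tokens + minutes * self.refill_per_min)

    def try_acquire(self, n=1):
        """Take n tokens if available; return False to signal throttling."""
        if n <= self.tokens:
            self.tokens -= n
            return True
        return False


bucket = TokenBucket(capacity=2000, refill_per_min=120)
burst_ok = bucket.try_acquire(2000)      # a full burst drains the bucket
throttled = not bucket.try_acquire(1)    # the next request is rejected
bucket.advance(minutes=1)                # one minute later, 120 tokens are back
refilled_ok = bucket.try_acquire(120)
```

In this model, a larger capacity absorbs bigger bursts of cluster creation, while a higher refill rate shortens the wait after a burst. Both let more requests through per unit of time, which is consistent with the CPU utilization caveat above.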

The best long-term solution is to replace the Cluster Manager core instance with a larger instance type that supports higher data transmission rates.

Databricks Support can change the current Cluster Manager instance type to a larger one.
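Until the limit or instance type is changed, the error message's own advice to retry the request later can be applied on the client side. The following is a minimal sketch of retrying with exponential backoff and jitter. It assumes the failure surfaces as a Python exception whose text contains REQUEST_LIMIT_EXCEEDED; the function name, parameters, and the `submit` callable are hypothetical and stand in for whatever call your code uses to start the job.

```python
import random
import time

RATE_LIMIT_MARKER = "REQUEST_LIMIT_EXCEEDED"


def retry_on_rate_limit(submit, max_attempts=5, base_delay=1.0,
                        max_delay=60.0, sleep=time.sleep):
    """Call submit(); retry with backoff only when the error looks rate-limit related.

    submit is a hypothetical zero-argument callable that starts the job and
    raises an exception on failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return submit()
        except Exception as err:
            # Re-raise unrelated errors, and give up after the final attempt.
            if RATE_LIMIT_MARKER not in str(err) or attempt == max_attempts:
                raise
            # Exponential backoff (1s, 2s, 4s, ...) capped at max_delay,
            # plus random jitter so parallel clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
```

Retrying does not raise the limit. It only spreads requests out over time, so workloads that routinely exceed the limit still need the changes described above.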