VcpuLimitExceeded error when creating a GPU ML cluster

You need to increase the vCPU capacity in your AWS account.

Written by Adam Pavlacka

Last published at: November 30th, 2023

 

Problem:

When you try to create a GPU ML cluster, the request fails with a VcpuLimitExceeded error. The error message says that the requested vCPU capacity exceeds the current vCPU limit of 0 for the instance bucket that the specified instance type belongs to.

Solution:

  1. Check with your AWS account team or AWS Support to understand why the VcpuLimitExceeded error is occurring. A limit of 0 indicates that your account currently has no vCPU quota for the instance bucket that the requested GPU instance type belongs to.
  2. Visit http://aws.amazon.com/contact-us/ec2-request and request an increase to the vCPU limit for that instance bucket.
  3. Make sure the requested limit is high enough to accommodate the total vCPU capacity that the cluster needs.
  4. Once AWS has applied the new vCPU limit, retry creating the GPU ML cluster.
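
Before you open the limit request, it can help to see which vCPU limits in your account are too low. The following is a minimal sketch, not part of the original article, and it assumes that boto3 is installed and that your credentials can read Service Quotas. The helper names (quotas_below, fetch_on_demand_vcpu_quotas) are hypothetical. Filtering on quota names that start with "Running On-Demand" is an assumption about how the vCPU limits are named (for example, "Running On-Demand P instances"), so verify the names in your own account.

```python
from __future__ import annotations

from typing import Iterable


def quotas_below(quotas: Iterable[dict], required_vcpus: float) -> list[dict]:
    """Return the quota entries whose current value is below required_vcpus."""
    return [q for q in quotas if q["Value"] < required_vcpus]


def fetch_on_demand_vcpu_quotas(region: str) -> list[dict]:
    """List EC2 quotas whose name starts with 'Running On-Demand' in one region.

    The name prefix is an assumption about how the vCPU limits are labelled.
    """
    import boto3  # imported here so quotas_below() works without boto3

    client = boto3.client("service-quotas", region_name=region)
    paginator = client.get_paginator("list_service_quotas")
    found = []
    for page in paginator.paginate(ServiceCode="ec2"):
        for quota in page["Quotas"]:
            if quota["QuotaName"].startswith("Running On-Demand"):
                found.append(quota)
    return found
```

For example, calling quotas_below(fetch_on_demand_vcpu_quotas("us-west-2"), 24) would return the On-Demand vCPU limits in that region that are below 24 vCPUs. The region and the 24 vCPU figure are placeholders; substitute the region of your workspace and the total vCPU count of the cluster you are trying to create.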