Best practices for viewing and querying Amazon SageMaker service quota usage

Amazon SageMaker customers can view and manage their quota limits through Service Quotas. In addition, they can view near real-time utilization metrics and create Amazon CloudWatch metrics to view and programmatically query SageMaker quotas.

SageMaker helps you build, train, and deploy machine learning (ML) models with ease. To learn more, refer to Getting started with Amazon SageMaker. Service Quotas simplifies limit management by allowing you to view and manage your quotas for SageMaker from a central location.

With Service Quotas, you can view the maximum number of resources, actions, or items in your AWS account or AWS Region. You can also use Service Quotas to request an increase for adjustable quotas.
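The same quota listing can also be done programmatically. The following is a minimal sketch, assuming the boto3 Service Quotas client and its `list_service_quotas` call; the helper names `iter_quotas` and `find_adjustable` are hypothetical and exist only for this illustration.

```python
from typing import Any, Dict, Iterator, List


def iter_quotas(client: Any, service_code: str = "sagemaker") -> Iterator[Dict[str, Any]]:
    """Yield every quota of a service, following NextToken pagination."""
    kwargs: Dict[str, Any] = {"ServiceCode": service_code}
    while True:
        page = client.list_service_quotas(**kwargs)
        yield from page.get("Quotas", [])
        token = page.get("NextToken")
        if not token:
            return
        kwargs["NextToken"] = token


def find_adjustable(client: Any, name_contains: str) -> List[Dict[str, Any]]:
    """Return adjustable quotas whose name contains the given text (case-insensitive)."""
    needle = name_contains.lower()
    return [
        {"code": q["QuotaCode"], "name": q["QuotaName"], "value": q["Value"]}
        for q in iter_quotas(client)
        if q.get("Adjustable") and needle in q["QuotaName"].lower()
    ]
```

With boto3 installed and credentials configured, passing `boto3.client("service-quotas")` as `client` should return the matching SageMaker quotas together with their quota codes, which are needed when requesting an increase.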

With the increasing adoption of MLOps practices, and therefore the demand for resources designated for ML model experimentation and retraining, more customers need to run multiple instances, often of the same instance type at the same time.

Many data science teams work in parallel, using several instances for processing, training, and tuning concurrently. Previously, users would sometimes reach an adjustable account limit for a particular instance type and had to manually request a limit increase from AWS.

To request quota increases manually from the Service Quotas UI, you can choose the quota from the list and choose Request quota increase. For more information, refer to Requesting a quota increase.

In this post, we show how you can use the new features to automatically request limit increases when instance usage reaches a high level.

Solution overview

The following diagram illustrates the solution architecture.

This architecture includes the following workflow:

  1. A CloudWatch metric monitors the usage of the resource. A CloudWatch alarm triggers when the resource usage goes beyond a certain preconfigured threshold.
  2. A message is sent to Amazon Simple Notification Service (Amazon SNS).
  3. The message is received by an AWS Lambda function.
  4. The Lambda function requests the quota increase.
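Steps 3 and 4 of the workflow can be sketched as follows. This is an illustrative sketch, not the code from the repository: it assumes the boto3 Service Quotas calls `get_service_quota` and `request_service_quota_increase`, and the function name `handle_alarm`, the `factor` parameter, and the way the quota code is passed in are all hypothetical choices made for this example.

```python
import json
import math
from typing import Any, Dict, Optional


def handle_alarm(
    event: Dict[str, Any],
    client: Any,
    service_code: str,
    quota_code: str,
    factor: float = 1.5,  # hypothetical growth factor, not from the post
) -> Optional[float]:
    """Request a quota increase when the SNS-delivered alarm is in ALARM state."""
    # SNS wraps the CloudWatch alarm notification as a JSON string.
    alarm = json.loads(event["Records"][0]["Sns"]["Message"])
    if alarm.get("NewStateValue") != "ALARM":
        return None  # ignore OK and INSUFFICIENT_DATA notifications

    current = client.get_service_quota(
        ServiceCode=service_code, QuotaCode=quota_code
    )["Quota"]["Value"]
    desired = float(math.ceil(current * factor))

    client.request_service_quota_increase(
        ServiceCode=service_code, QuotaCode=quota_code, DesiredValue=desired
    )
    return desired
```

In a deployed function, a `lambda_handler(event, context)` entry point would call this helper with `boto3.client("service-quotas")` and a quota code taken from configuration. Keeping the client as a parameter makes the logic easy to exercise without AWS access.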

Aside from requesting a quota increase for the specific account, the Lambda function can also add the quota increase to the organization template (up to 10 quotas). This way, any new account created under a given AWS Organization has the increased quota requests by default.
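Adding the request to the organization template might look like the sketch below. It assumes the boto3 Service Quotas calls `get_association_for_service_quota_template`, `list_service_quota_increase_requests_in_template`, and `put_service_quota_increase_request_into_template`. The helper name `add_to_org_template` is hypothetical, and the sketch assumes the template list fits in a single response page.

```python
from typing import Any

TEMPLATE_MAX_QUOTAS = 10  # template limit stated in this post


def add_to_org_template(
    client: Any, service_code: str, quota_code: str, region: str, desired_value: float
) -> bool:
    """Add a quota request to the organization template if it is enabled and has room."""
    status = client.get_association_for_service_quota_template()[
        "ServiceQuotaTemplateAssociationStatus"
    ]
    if status != "ASSOCIATED":
        return False  # the quota request template is not enabled

    existing = client.list_service_quota_increase_requests_in_template().get(
        "ServiceQuotaIncreaseRequestInTemplateList", []
    )
    already_present = any(
        r.get("ServiceCode") == service_code
        and r.get("QuotaCode") == quota_code
        and r.get("AwsRegion") == region
        for r in existing
    )
    # A new entry needs a free slot; updating an existing entry does not.
    if not already_present and len(existing) >= TEMPLATE_MAX_QUOTAS:
        return False

    client.put_service_quota_increase_request_into_template(
        ServiceCode=service_code,
        QuotaCode=quota_code,
        AwsRegion=region,
        DesiredValue=desired_value,
    )
    return True
```

The function returns `False` rather than raising when the template is disabled or full, so the account-level increase can still proceed independently.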


Prerequisites

Complete the following prerequisite steps:

  1. Set up an AWS account and create an AWS Identity and Access Management (IAM) user. For instructions, refer to Secure Your AWS Account.
  2. Install the AWS SAM CLI.

Deploy using AWS Serverless Application Model

To deploy the application using the GitHub repo, run the following commands in the terminal:

git clone
cd sagemaker-quotas-alarm
sam build && sam deploy --stack-name utilization --region us-east-1 --resolve-s3 --capabilities CAPABILITY_IAM --parameter-overrides ResourceUsageThreshold=50 SecurityGroupIds=<SECURITY-GROUP-IDS> SubnetIds=<SUBNETS>

After the solution is deployed, you should have a new alarm on the CloudWatch console. This alarm monitors usage for SageMaker notebook instances of the ml.t3.medium instance type.

If your resource usage reaches more than 50%, the alarm triggers and the Lambda function requests an increase.
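The threshold check itself is simple arithmetic. As a small worked sketch, assuming a hypothetical quota of 20 instances (not a real SageMaker default) and the `ResourceUsageThreshold=50` value from the deploy command:

```python
def utilization_percent(usage: float, quota: float) -> float:
    """Return usage as a percentage of the quota."""
    if quota <= 0:
        raise ValueError("quota must be positive")
    return 100.0 * usage / quota


def exceeds_threshold(usage: float, quota: float, threshold_percent: float = 50.0) -> bool:
    """True when utilization is strictly greater than the threshold."""
    return utilization_percent(usage, quota) > threshold_percent


# Hypothetical numbers: 11 running instances against a quota of 20.
print(utilization_percent(11, 20))   # 55.0
print(exceeds_threshold(11, 20))     # True: 55% is above the 50% threshold
print(exceeds_threshold(10, 20))     # False: exactly 50% does not exceed it
```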

If your account is part of an AWS Organization and you have the quota request template enabled, you should also see these increases on the template, provided the template has available slots. This way, new accounts from that organization also have the increases configured upon creation.

Deploy using the CloudWatch console

To deploy the application using the CloudWatch console, complete the following steps:

  1. On the CloudWatch console, choose All alarms in the navigation pane.
  2. Choose Create alarm.
  3. Choose Select metric.
  4. Choose Usage.
  5. Select the metric you want to monitor.
  6. Select the condition for when you would like the alarm to trigger.

For more possible configurations when configuring the alarm, see Create a CloudWatch alarm based on a static threshold.

  7. Configure the SNS topic to be notified about the alarm.

You can also use Amazon SNS to trigger a Lambda function when the alarm is triggered. See Using AWS Lambda with Amazon SNS for more information.

  8. For Alarm name, enter a name.
  9. Choose Next.
  10. Choose Create alarm.
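The console steps above have a programmatic counterpart. The sketch below builds the arguments for the boto3 CloudWatch `put_metric_alarm` call, using the `AWS/Usage` namespace and the `SERVICE_QUOTA` metric math function to express usage as a percentage of the quota. The helper name `build_usage_alarm` is hypothetical, and the dimension values for a given SageMaker resource are an assumption that you should copy from the metric shown on the CloudWatch console.

```python
from typing import Any, Dict


def build_usage_alarm(
    alarm_name: str,
    topic_arn: str,
    dimensions: Dict[str, str],
    threshold_percent: float = 50.0,
) -> Dict[str, Any]:
    """Build put_metric_alarm arguments for a quota-utilization alarm."""
    return {
        "AlarmName": alarm_name,
        "ComparisonOperator": "GreaterThanThreshold",
        "EvaluationPeriods": 1,
        "Threshold": threshold_percent,
        "AlarmActions": [topic_arn],  # SNS topic notified when the alarm fires
        "Metrics": [
            {
                "Id": "usage",
                "ReturnData": False,  # raw usage feeds the expression only
                "MetricStat": {
                    "Metric": {
                        "Namespace": "AWS/Usage",
                        "MetricName": "ResourceCount",
                        "Dimensions": [
                            {"Name": k, "Value": v} for k, v in dimensions.items()
                        ],
                    },
                    "Period": 300,
                    "Stat": "Maximum",
                },
            },
            {
                "Id": "pct",
                "Expression": "(usage / SERVICE_QUOTA(usage)) * 100",
                "Label": "Quota utilization (%)",
                "ReturnData": True,  # the alarm evaluates this series
            },
        ],
    }
```

With boto3 installed, the result can be passed as `boto3.client("cloudwatch").put_metric_alarm(**build_usage_alarm(...))`. Separating the argument builder from the API call keeps the alarm definition reviewable and testable on its own.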

Clean up

To clean up the resources created as part of this post, make sure to delete all the created stacks. To do that, run the following command:

sam delete --stack-name utilization --region us-east-1


Conclusion

In this post, we showed how you can use the new integration from SageMaker with Service Quotas to automate the requests for quota increases for SageMaker resources. This way, data science teams can effectively work in parallel and reduce issues related to unavailability of instances.

You can learn more about Amazon SageMaker quotas by accessing the documentation. You can also learn more about Service Quotas here.

About the authors

Bruno Klein is a Machine Learning Engineer in the AWS ProServe team. He particularly enjoys creating automations and improving the lifecycle of models in production. In his free time, he likes to spend time outdoors and hiking.

Paras Mehra is a Senior Product Manager at AWS. He's focused on helping build Amazon SageMaker Training and Processing. In his spare time, Paras enjoys spending time with his family and road biking around the Bay Area. You can find him on LinkedIn.
