Scaling-in Issue with SageMaker Automatic Scaling Policy for Production Variant Instances in AWS

Question

You work as a machine learning specialist for a social media software company.

Your company produces social media apps such as interactive games and photo-sharing communities.

Your machine learning team has created a machine learning model that produces recommendations via advertising in your apps, such as showing advertising for skiing trips to a user who follows a ski resort in the photo-sharing app.

The production variant that your team has deployed experiences wide swings in traffic volume over the course of any given day.

Also, since the app is relatively new to the mobile community, it receives no traffic at all during some periods.

You have set up a SageMaker automatic scaling policy for your production variant instances.

However, you have noticed that scaling in does not happen when your traffic drops to zero for a period of time.

Why might this happen?

Answers

Explanations

A. B. C. D.

Correct Answer: A.

Option A is correct.

When your production variant doesn't receive any traffic, SageMaker does not emit any metrics.

Therefore, CloudWatch has no metric data to evaluate, and the alarm that would initiate the scale-in defined in your scaling policy is never triggered.
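The effect of missing datapoints can be illustrated with a toy alarm evaluator. This is a simplified model written for this explanation, not the actual CloudWatch evaluation logic: the function name, the state strings as used here, and the rule that only present datapoints are compared are all assumptions.

```python
from typing import Optional, Sequence


def scale_in_alarm_state(
    datapoints: Sequence[Optional[float]],
    threshold: float,
    periods: int,
) -> str:
    """Toy model of a 'metric below threshold' alarm.

    Each entry in `datapoints` is one evaluation period. None means the
    service emitted no datapoint for that period (for example, because
    the variant received no traffic).
    """
    recent = list(datapoints)[-periods:]
    present = [p for p in recent if p is not None]

    # Assumption: with no datapoints at all there is nothing to compare,
    # so the alarm cannot enter the state that would trigger a scale-in.
    if not present:
        return "INSUFFICIENT_DATA"

    # Assumption: the alarm fires only if every present datapoint is
    # below the threshold.
    if all(p < threshold for p in present):
        return "ALARM"
    return "OK"


if __name__ == "__main__":
    # Low but non-zero traffic: datapoints exist, so the alarm can fire.
    print(scale_in_alarm_state([10.0, 5.0, 2.0], threshold=50.0, periods=3))
    # No traffic: no datapoints, so the alarm never reaches ALARM.
    print(scale_in_alarm_state([None, None, None], threshold=50.0, periods=3))
```

In this model, a quiet period with a trickle of requests can trigger a scale-in, while a period with no requests at all cannot, which matches the behavior described in the question.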

Option B is incorrect.

Even if you used a step scaling policy and set the number of instances to a high number, the policy won't be triggered by CloudWatch because SageMaker does not emit any metrics when your production variant receives no traffic.

Option C is incorrect.

Even if you used a step scaling policy and set the number of instances to a low number, the policy won't be triggered by CloudWatch because SageMaker does not emit any metrics when your production variant receives no traffic.

Option D is incorrect.

Even if you used a target tracking scaling policy and set the target value for your metric to a low value, the policy won't be triggered by CloudWatch because SageMaker does not emit any metrics when your production variant receives no traffic.

References:

The Amazon SageMaker developer guide, "Prerequisites" (https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-auto-scaling-prerequisites.html)

The Application Auto Scaling user guide, "Target tracking scaling policies for Application Auto Scaling" (https://docs.aws.amazon.com/autoscaling/application/userguide/application-auto-scaling-target-tracking.html)

The Application Auto Scaling user guide, "Step scaling policies for Application Auto Scaling" (https://docs.aws.amazon.com/autoscaling/application/userguide/application-auto-scaling-step-scaling-policies.html)

The correct answer is A. If traffic to your production variant becomes zero, SageMaker automatic scaling won't scale in. SageMaker doesn't emit metrics with a value of zero, so the CloudWatch alarm that would start the scale-in is never triggered.

SageMaker is a managed service from Amazon Web Services (AWS) that enables developers to build, train, and deploy machine learning models at scale.

In this scenario, the machine learning team has created a model that produces recommendations via advertising in social media apps such as interactive games and photo-sharing communities. The production variant that the team has deployed experiences significant swings in traffic volume over the course of any given day, and during some periods it receives no traffic at all.

To handle this situation, the machine learning specialist has set up a SageMaker automatic scaling policy for the production variant instances. An automatic scaling policy adjusts the number of instances behind a production variant based on the traffic the application receives.
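As an illustration of what such a policy can look like, the sketch below builds the request parameters for a target tracking policy on a SageMaker variant. The scenario does not say which policy type or values the specialist used, so the endpoint name, variant name, policy name, target value, and cooldowns are all hypothetical. The function only builds a dictionary and makes no AWS calls.

```python
from typing import Any, Dict


def build_target_tracking_request(
    endpoint_name: str,
    variant_name: str,
    target_invocations_per_instance: float,
) -> Dict[str, Any]:
    """Build keyword arguments for Application Auto Scaling's
    PutScalingPolicy operation (hypothetical names and values)."""
    return {
        "PolicyName": f"{variant_name}-invocations-target",  # hypothetical name
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Average invocations per instance the policy tries to maintain.
            "TargetValue": target_invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
            # Illustrative cooldowns in seconds, not recommendations.
            "ScaleInCooldown": 300,
            "ScaleOutCooldown": 60,
        },
    }


if __name__ == "__main__":
    request = build_target_tracking_request("ads-endpoint", "AllTraffic", 70.0)
    print(request["ResourceId"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `put_scaling_policy(**request)` on an `application-autoscaling` client, after the variant has been registered with `register_scalable_target`.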

However, the specialist has noticed that scaling in does not happen when the traffic drops to zero for a period of time. This occurs because SageMaker doesn't emit metrics with a value of zero, so the CloudWatch alarm that would start the scale-in is never triggered. As a result, the automatic scaling policy cannot detect the decrease in traffic and does not scale in the instances.

Options B and C are incorrect because they attribute the problem to the settings of a step scaling policy. A step scaling policy changes the instance count by a set of scaling adjustments (steps) that depend on how far the metric has breached the alarm threshold. The scenario does not specify which policy type is in use, and in any case a step scaling policy is also triggered by a CloudWatch alarm, which cannot fire when no metrics are emitted.
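A rough sketch of how a step scaling policy picks an adjustment, and why it does nothing without a datapoint, is shown below. The key names follow the step adjustment fields in Application Auto Scaling, but the function, the step values, and the boundary handling (lower bound inclusive, upper bound exclusive) are simplifying assumptions rather than the service's exact rules.

```python
from typing import Dict, List, Optional


def pick_step_adjustment(
    metric_value: Optional[float],
    threshold: float,
    steps: List[Dict[str, float]],
) -> Optional[int]:
    """Return the instance-count change for the matching step.

    Returns None when there is no datapoint, because nothing can be
    compared against the threshold in that case.
    """
    if metric_value is None:
        return None

    # Step bounds are expressed relative to the alarm threshold.
    delta = metric_value - threshold
    for step in steps:
        lower = step.get("MetricIntervalLowerBound")  # missing: no lower limit
        upper = step.get("MetricIntervalUpperBound")  # missing: no upper limit
        if (lower is None or delta >= lower) and (upper is None or delta < upper):
            return int(step["ScalingAdjustment"])
    return 0


# Hypothetical scale-in steps: remove 1 instance for a small breach,
# 2 instances for a large one.
SCALE_IN_STEPS = [
    {"MetricIntervalLowerBound": -20, "MetricIntervalUpperBound": 0, "ScalingAdjustment": -1},
    {"MetricIntervalUpperBound": -20, "ScalingAdjustment": -2},
]

if __name__ == "__main__":
    print(pick_step_adjustment(40.0, 50.0, SCALE_IN_STEPS))
    print(pick_step_adjustment(10.0, 50.0, SCALE_IN_STEPS))
    print(pick_step_adjustment(None, 50.0, SCALE_IN_STEPS))
```

However large or small the adjustments are set, the missing-datapoint case returns before any step is consulted.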

Option D is incorrect because it attributes the problem to the target value of a target tracking policy. A target tracking policy scales the instances to keep a chosen metric close to a target value. The scenario does not specify which policy type is in use, and in any case lowering the target value does not help, because with no metrics emitted there is nothing to compare against the target.
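The idea behind target tracking can be sketched with a simple proportional rule. This is an approximation for illustration only, not the algorithm Application Auto Scaling actually runs; the function name, its parameters, and the choice to keep the current capacity when no datapoint exists are assumptions.

```python
import math
from typing import Optional


def desired_capacity(
    current_instances: int,
    metric_per_instance: Optional[float],
    target_per_instance: float,
    min_capacity: int,
    max_capacity: int,
) -> int:
    """Estimate the instance count that brings the per-instance metric
    back to the target, clamped to the configured capacity range."""
    # Assumption: with no datapoint there is nothing to track, so the
    # capacity stays where it is regardless of the target value.
    if metric_per_instance is None:
        return current_instances

    # Proportional rule: total load divided by the load one instance
    # should carry at the target.
    needed = math.ceil(current_instances * metric_per_instance / target_per_instance)
    return max(min_capacity, min(max_capacity, needed))


if __name__ == "__main__":
    # Half the target load per instance: capacity can be halved.
    print(desired_capacity(4, 25.0, 50.0, min_capacity=1, max_capacity=10))
    # No datapoint: capacity is unchanged.
    print(desired_capacity(4, None, 50.0, min_capacity=1, max_capacity=10))
```

In this model, changing `target_per_instance` has no effect on the result when `metric_per_instance` is missing.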

In conclusion, the correct answer is A. When traffic to the production variant becomes zero, the SageMaker automatic scaling policy won't scale in because SageMaker doesn't emit metrics with a value of zero, so the CloudWatch alarm that would start the scale-in is never triggered.