You are part of a machine learning team at a financial services company. Your team has built a model in SageMaker that performs time series forecasting of stock price movement.
You have finished training the model and are now ready to load test your endpoint to determine the best parameters for configuring auto-scaling for your model variant.
How can you most efficiently review the latency, memory utilization, and CPU utilization during the load test?
A. Stream the SageMaker model variant CloudWatch logs to ElasticSearch and visualize and query the log data in a Kibana dashboard.
B. Create custom CloudWatch logs containing the metrics you wish to monitor, stream the SageMaker model variant logs to ElasticSearch, and visualize and query the log data in a Kibana dashboard.
C. Create a CloudWatch dashboard showing the latency, memory utilization, and CPU utilization metrics of the SageMaker model variant.
D. Query the SageMaker model variant logs on S3 using Athena and visualize the logs in QuickSight.
Correct Answer: C.
Option A is incorrect.
Using ElasticSearch and Kibana unnecessarily complicates the solution.
A CloudWatch dashboard can show all of the metric data you need to evaluate your model variant.
Option B is incorrect.
You don't need to create custom CloudWatch logs with the metrics you wish to monitor.
SageMaker publishes all of the metrics you wish to view (latency, memory utilization, and CPU utilization) to CloudWatch by default.
Also, using ElasticSearch and Kibana unnecessarily complicates the solution.
Option C is correct.
The simplest approach is to use a CloudWatch dashboard: SageMaker publishes the metrics you wish to view (latency, memory utilization, and CPU utilization) to CloudWatch by default, so a dashboard can display them in a single view with no additional setup.
Option D is incorrect.
Using Athena and QuickSight unnecessarily complicates the solution.
A CloudWatch dashboard can show all of the metric data you need to evaluate your model variant.
References:
Amazon SageMaker Developer Guide: Monitor Amazon SageMaker with Amazon CloudWatch (https://docs.aws.amazon.com/sagemaker/latest/dg/monitoring-cloudwatch.html)
Amazon CloudWatch User Guide: Using Amazon CloudWatch Dashboards (https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Dashboards.html)
Amazon CloudWatch User Guide: Creating a CloudWatch Dashboard (https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/create_dashboard.html)
The most efficient way to review the latency, memory utilization, and CPU utilization during the load test for the SageMaker model variant is by creating a CloudWatch dashboard.
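As a concrete illustration, the sketch below uses boto3 to build such a dashboard. The endpoint name (my-endpoint), variant name (my-variant), region, and dashboard name are assumptions for illustration; substitute the values from your own deployment. It plots the invocation latency that SageMaker publishes to the AWS/SageMaker namespace alongside the per-instance CPU and memory utilization published to the /aws/sagemaker/Endpoints namespace.

```python
import json
import boto3

# Assumptions: endpoint "my-endpoint" with production variant "my-variant"
# already exists, and the load test runs in us-east-1. Adjust as needed.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

dashboard_body = {
    "widgets": [
        {
            "type": "metric",
            "x": 0, "y": 0, "width": 12, "height": 6,
            "properties": {
                "title": "Model latency",
                "view": "timeSeries",
                "stat": "Average",
                "period": 60,
                "region": "us-east-1",
                # Invocation metrics live in the AWS/SageMaker namespace.
                "metrics": [
                    ["AWS/SageMaker", "ModelLatency",
                     "EndpointName", "my-endpoint", "VariantName", "my-variant"],
                ],
            },
        },
        {
            "type": "metric",
            "x": 12, "y": 0, "width": 12, "height": 6,
            "properties": {
                "title": "Instance CPU and memory utilization",
                "view": "timeSeries",
                "stat": "Average",
                "period": 60,
                "region": "us-east-1",
                # Instance utilization metrics live in /aws/sagemaker/Endpoints.
                "metrics": [
                    ["/aws/sagemaker/Endpoints", "CPUUtilization",
                     "EndpointName", "my-endpoint", "VariantName", "my-variant"],
                    [".", "MemoryUtilization", ".", ".", ".", "."],
                ],
            },
        },
    ]
}

# Create (or overwrite) the dashboard used during the load test.
cloudwatch.put_dashboard(
    DashboardName="sagemaker-load-test",
    DashboardBody=json.dumps(dashboard_body),
)
```

Because SageMaker emits these metrics automatically, no log customization or additional pipeline is needed; the dashboard is purely a view over data that already exists in CloudWatch.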
Option A suggests streaming the SageMaker model variant CloudWatch logs to ElasticSearch and then visualizing and querying the log data in a Kibana dashboard. While this approach may work, it requires setting up ElasticSearch and Kibana, which can be time-consuming and complicated. Also, it is not the most efficient way to review the metrics since CloudWatch already offers a dashboard service that is purpose-built for this use case.
Option B suggests creating custom CloudWatch logs containing the metrics to monitor, then streaming the SageMaker model variant logs to ElasticSearch, and visualizing/querying the log data in a Kibana dashboard. This option is also unnecessarily complicated since it requires customizing CloudWatch logs, setting up ElasticSearch and Kibana, and then building visualizations. This approach is time-consuming and adds an unnecessary layer of complexity to the monitoring process.
Option D suggests querying the SageMaker model variant logs on S3 using Athena and leveraging QuickSight to visualize the logs. While this approach can work, it is more complicated than necessary, and it is not as efficient as using the built-in CloudWatch dashboard service.
Therefore, Option C is the best choice: create a CloudWatch dashboard that shows the latency, memory utilization, and CPU utilization metrics of the SageMaker model variant. CloudWatch is integrated with SageMaker and is a fully managed service, which makes it easy to set up and use. A CloudWatch dashboard allows real-time monitoring of the model's performance during the load test, which is essential for determining the best parameters for configuring auto-scaling for the model variant. With the dashboard, you can monitor all relevant metrics in one place, making it easier to identify performance bottlenecks and take corrective action quickly.
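Once the load test has shown how many invocations per instance the variant can sustain at acceptable latency and CPU/memory utilization, that figure feeds directly into the scaling policy. The following is a minimal sketch of configuring target-tracking auto-scaling with Application Auto Scaling; the endpoint and variant names, capacity limits, cooldowns, and the target of 100 invocations per instance are assumed values for illustration, not prescriptions.

```python
import boto3

autoscaling = boto3.client("application-autoscaling", region_name="us-east-1")

# Resource ID format for a SageMaker variant:
# endpoint/<endpoint-name>/variant/<variant-name>
resource_id = "endpoint/my-endpoint/variant/my-variant"

# Register the variant's instance count as a scalable target.
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

# Track invocations per instance, using a target derived from the load test.
autoscaling.put_scaling_policy(
    PolicyName="invocations-target-tracking",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 100.0,  # assumed value observed during the load test
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        "ScaleInCooldown": 300,
        "ScaleOutCooldown": 60,
    },
)
```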