You work as a machine learning specialist for the infectious disease testing department of a national government agency.
Your machine learning team is responsible for creating a machine learning model that analyzes the daily test datasets for your country and produces daily predictions of trends of disease contraction and death rates.
These projections are used throughout national and international news agencies to report on the daily projections of infectious disease progression.
Since your model works on huge datasets on a daily basis, which of the following statements gives an accurate description of your inference processing?
A. B. C. D.
Correct Answer: C.
Option A is incorrect.
SageMaker batch transform does not use a persistent endpoint.
You use SageMaker batch transform to get inferences from large datasets.
Also, your process runs once per day, so a persistent endpoint does not make sense.
Option B is incorrect.
You are processing large datasets on a daily basis.
Therefore you should use SageMaker batch transform, not SageMaker hosting services.
SageMaker hosting services are used for real-time inference requests, not daily batch requests.
Option C is correct.
Since you get inferences once per day from a large dataset, you don't need a persistent endpoint.
Also, SageMaker batch transform is the best deployment option when getting inferences from an entire dataset.
Option D is incorrect.
SageMaker hosting services need a persistent endpoint.
Also, since you are processing large datasets on a daily basis, you should use SageMaker batch transform, not SageMaker hosting services.
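The batch approach described for option C can be sketched as the request that boto3's `create_transform_job` call expects. This is only a minimal sketch under stated assumptions, not the team's actual setup: the model name, S3 paths, instance type, and CSV content type are hypothetical placeholders, and the model is assumed to already be registered in SageMaker.

```python
from datetime import date


def build_transform_request(model_name, input_s3_uri, output_s3_uri, run_date):
    """Build the request dict for SageMaker's CreateTransformJob API.

    Pure Python: nothing in this function talks to AWS.
    """
    return {
        # Job names must be unique per account and region, so include the date.
        "TransformJobName": f"{model_name}-{run_date.isoformat()}",
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3_uri}
            },
            "ContentType": "text/csv",  # assumption: daily data arrives as CSV
            "SplitType": "Line",        # treat each line as one record
        },
        "TransformOutput": {
            "S3OutputPath": output_s3_uri,
            "AssembleWith": "Line",     # write one prediction per line
        },
        "TransformResources": {
            "InstanceType": "ml.m5.xlarge",  # hypothetical instance choice
            "InstanceCount": 1,
        },
    }


def submit_daily_job(request):
    """Submit the job; its instances exist only for the duration of the job."""
    import boto3  # imported here so the builder above works without AWS access

    boto3.client("sagemaker").create_transform_job(**request)


request = build_transform_request(
    "disease-trend-model",                          # hypothetical model name
    "s3://example-bucket/tests/2024-01-15/",        # hypothetical input prefix
    "s3://example-bucket/predictions/2024-01-15/",  # hypothetical output prefix
    date(2024, 1, 15),
)
print(request["TransformJobName"])
```

Note that no endpoint is created anywhere in this flow. A scheduler would call `submit_daily_job(request)` once per day, and the predictions would be read from the output prefix after the job finishes.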
References:
Please see the AWS SageMaker developer guide titled Use Batch Transform (https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html)
and the AWS SageMaker developer guide titled Deploy Models for Inference (https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-model.html).
In this scenario, the machine learning team is responsible for creating a model that analyzes the daily test datasets and produces daily predictions of trends of disease contraction and death rates. These projections are used by news agencies to report on the daily projections of infectious disease progression. Since the model works on huge datasets on a daily basis, it is essential to have an efficient inference processing system.
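One rough way to make "efficient" concrete is to compare how many instance-hours each deployment style keeps running. The sketch below is illustrative arithmetic only: the two-hour job duration and the single instance are hypothetical values, and it ignores pricing differences between instance types as well as job startup time.

```python
def endpoint_instance_hours(instance_count, days):
    """A persistent endpoint keeps its instances running around the clock."""
    return instance_count * 24 * days


def batch_instance_hours(instance_count, job_hours, days):
    """A batch transform job uses instances only while the daily job runs."""
    return instance_count * job_hours * days


days = 30
job_hours = 2  # hypothetical duration of one daily batch job

endpoint_hours = endpoint_instance_hours(1, days)
batch_hours = batch_instance_hours(1, job_hours, days)
print(endpoint_hours, batch_hours)
```

Under these assumed numbers, a once-per-day workload leaves an always-on endpoint idle for most of each day, while a batch job accrues instance time only while it is running.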
The best option among the given options is C, which uses SageMaker batch transform and does not rely on a persistent endpoint.
Amazon SageMaker is a fully managed service that provides developers and data scientists with the ability to build, train, and deploy machine learning models at scale. It provides various options for inference processing, including batch transform and hosting services.
Batch transform is used when we have a large amount of data that needs to be processed in batches. We can use SageMaker batch transform to run inferences on an entire dataset and get the results in batches. It is not suitable for real-time or near real-time use cases, but this scenario has none: predictions are produced once per day.
On the other hand, hosting services provide a persistent endpoint that can be used to get predictions in real time or near real time. The endpoint can be called using an API or SDK, and it provides a scalable way to serve individual prediction requests. Nothing in the scenario calls for low-latency, per-request predictions, so a persistent endpoint would sit idle for most of each day.
Therefore, option C is the correct answer: use SageMaker batch transform, without a persistent endpoint, to get the daily inferences from the full dataset.