You work for a major retail chain in their web development area.
You are on the machine learning team responsible for building a recommendation engine for the chain's retail website, which sells many different items across many categories.
The recommendation engine will use customer data such as purchase history, credit rating, geographic location, household income, response to past marketing mailings, etc.
Your marketing team has decided to send a marketing mailing to customers who have responded to past mailings.
They have two different content templates to use depending on the classification category of each customer.
Your model needs to recommend which mailing template to use for each customer in the target customer dataset. Which SageMaker built-in algorithm is best suited to this problem, and what value should you use for the predictor_type hyperparameter for the desired outcome? (Select TWO)
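A. Linear Learner
B. classifier
C. regressor
D. Factorization Machines
E. multiclass_classifier
F. K-Means
G. binary_classifier
H. Neural Topic Model
Answers: D, G.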
Option A is incorrect.
The Linear Learner is best suited for discrete classification problems.
But you have already classified your customers.
You are now trying to provide a discrete recommendation.
The Factorization Machine algorithm is better suited for this type of problem.
Option B is incorrect.
The classifier predictor_type hyperparameter value is not a valid choice for the Factorization Machine algorithm.
The classifier predictor_type hyperparameter value is a valid choice for the K-Nearest-Neighbor algorithm.
Option C is incorrect.
The regressor predictor_type value is used for regression problems, where you are solving for a quantitative (continuous) value, so it is not the correct choice here.
You are trying to solve for a discrete value.
Option D is correct.
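The Factorization Machines algorithm is a good choice for problems where you are trying to solve for a discrete recommendation. It is designed to capture interactions between features in sparse, high-dimensional data, which is typical of recommendation workloads.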
Option E is incorrect.
The multiclass_classifier predictor_type hyperparameter value is not a valid choice for the Factorization Machine algorithm.
The multiclass_classifier predictor_type hyperparameter value is a valid choice for the Linear Learner algorithm.
Option F is incorrect.
The K-Means algorithm is best used for grouping observations.
You are trying to solve a discrete recommendation problem.
You are not trying to group customers.
Option G is correct.
The binary_classifier predictor_type hyperparameter value is the correct choice for this discrete recommendation problem where you are attempting to choose one of two possible outcomes (one of the two content templates).
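As a rough illustration of what a binary factorization machine computes, the sketch below scores one customer feature vector with the standard second-order factorization machine formula and thresholds the resulting probability to pick one of two templates. It is not SageMaker's implementation; the function names, the toy weights, and the 0.5 threshold are all assumptions made for this example.

```python
import math

def fm_score(x, w0, w, v):
    """Second-order factorization machine score for one feature vector.

    x  : list of feature values
    w0 : global bias
    w  : list of linear weights, one per feature
    v  : list of latent factor vectors, one per feature
    """
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(v[0])):
        # Equivalent to summing <v_i, v_j> * x_i * x_j over all pairs i < j,
        # but computed in time linear in the number of features.
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise

def choose_template(x, w0, w, v, threshold=0.5):
    """Return 0 or 1 (hypothetical template ids) from the sigmoid of the score."""
    probability = 1.0 / (1.0 + math.exp(-fm_score(x, w0, w, v)))
    return 1 if probability >= threshold else 0

# Toy parameters, invented for illustration only.
x = [1.0, 0.0, 1.0]
w0 = 0.1
w = [0.2, -0.3, 0.4]
v = [[0.5, 0.1], [0.2, 0.3], [-0.4, 0.6]]
print(choose_template(x, w0, w, v))
```

In a real training job the weights would be learned from the customer data; only the scoring and thresholding step is shown here.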
Option H is incorrect.
The Neural Topic Model algorithm is best suited to organizing documents into topics using groupings of words based on their statistical distribution within the documents.
This algorithm is not a good choice for a discrete recommendation problem.
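The predictor_type claims made in the option explanations can be collected into a small lookup, which may make the valid and invalid pairings easier to check. This is a sketch only: the dictionary keys are illustrative labels chosen for this example, and the helper function is hypothetical rather than part of any SageMaker API. The allowed values should be confirmed against the current SageMaker documentation.

```python
# predictor_type values accepted by each algorithm, as described in the
# explanations (keys are illustrative labels, not an official API).
VALID_PREDICTOR_TYPES = {
    "linear-learner": {"binary_classifier", "multiclass_classifier", "regressor"},
    "factorization-machines": {"binary_classifier", "regressor"},
    "knn": {"classifier", "regressor"},
}

def is_valid_predictor_type(algorithm, predictor_type):
    """Return True if the algorithm accepts the given predictor_type value."""
    return predictor_type in VALID_PREDICTOR_TYPES.get(algorithm, set())

# Option G with option D: valid.
print(is_valid_predictor_type("factorization-machines", "binary_classifier"))
# Options B and E with option D: not valid.
print(is_valid_predictor_type("factorization-machines", "classifier"))
print(is_valid_predictor_type("factorization-machines", "multiclass_classifier"))
```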
Reference:
Please see the Amazon SageMaker developer guide titled Use Amazon SageMaker Built-in Algorithms, and the AWS Machine Learning blog titled Build a movie recommender with factorization machines on Amazon SageMaker.
The problem at hand is a classification problem, where we need to predict which mailing template to use for each customer based on their features. Thus, the appropriate built-in algorithm to use in Amazon SageMaker is a classification algorithm.
Among the options given, the algorithm choices that support classification are Linear Learner and Factorization Machines, and the predictor_type choices determine whether the model is trained as a binary classifier, a multiclass classifier, or a regressor.
A binary classifier is used when there are only two classes to predict, while a multiclass classifier is used when there are three or more classes. In this case, we have two different content templates, which means we have two classes to predict. Thus, we need to use a binary classifier.
Linear Learner is one built-in algorithm that fits this framing. It trains linear models, scales to large datasets, and supports both binary and multiclass classification; binary classification applies here because there are only two templates to choose between.
The predictor_type hyperparameter for Linear Learner specifies the type of prediction that the model will make. Since we are predicting a binary outcome (which mailing template to use), we should set the predictor_type hyperparameter to binary_classifier. This tells the algorithm to output a binary prediction (0 or 1) for each customer, indicating which mailing template to use.
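To make the 0-or-1 output concrete, the sketch below maps predicted labels to mailing templates. It assumes the endpoint returns JSON with a "predictions" list whose entries carry "score" and "predicted_label" fields, which is the shape documented for binary classifiers at the time of writing and should be verified. The template names, the helper function, and the sample response are hypothetical.

```python
import json

# Hypothetical mapping from predicted label to mailing template name.
TEMPLATES = {0: "template_a", 1: "template_b"}

def templates_from_response(body):
    """Map each prediction in a JSON response body to a template name.

    Assumes a body shaped like:
    {"predictions": [{"score": 0.83, "predicted_label": 1}, ...]}
    """
    payload = json.loads(body)
    return [TEMPLATES[int(p["predicted_label"])] for p in payload["predictions"]]

# Invented sample response for two customers.
sample = '{"predictions": [{"score": 0.83, "predicted_label": 1}, {"score": 0.12, "predicted_label": 0}]}'
print(templates_from_response(sample))
```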
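On this reasoning, A. Linear Learner with G. binary_classifier is also a workable pairing. The answer key above selects D. Factorization Machines with G. binary_classifier because the task is framed as a recommendation problem.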