You work as a machine learning specialist for a bank.
Your bank management team is concerned about a recent increase in fraudulent transactions.
You need to build a machine learning model to recognize fraudulent transactions in real-time.
The following is a sample of the dataset you are using to train your model:

| Transaction ID | Account ID | type   | amount  | source | … | fraud        |
| 12576477       | 37564378   | debit  | 350.00  | ATM    | … | not_fraud    |
| 39844569       | 74897544   | credit | 756.23  | ATM    | … | not_fraud    |
| 54986984       | 55656753   | credit | 243.90  | ATM    | … | undetermined |
| 34567863       | 27564378   | debit  | 1250.00 | ATM    | … | fraud        |

You are using the fraud feature as your label.
You have decided to use the linear learner built-in algorithm for your model.
Which predictor type should you use for your linear learner?
A. binary_classifier
B. regressor
C. cross_entropy_loss
D. multiclass_classifier
Answer: D.
Option A is incorrect.
The binary_classifier predictor type is used when your target feature is binary, either yes or no, 1 or 0, etc.
Your target feature has three potential values.
Therefore it is classified across multiple classes.
Option B is incorrect.
The regressor predictor type is used for regression models where your target feature is a continuous numeric value.
Your target feature is categorical, with three potential values, rather than a continuous number.
Therefore it calls for classification across multiple classes, not regression.
Option C is incorrect.
The cross_entropy_loss is an option for the binary_classifier_model_selection_criteria parameter of the linear learner SageMaker algorithm.
This parameter applies only when you are using a binary classifier as your predictor type; cross_entropy_loss is a model selection criterion, not a predictor type.
Option D is correct.
The multiclass_classifier predictor type is used when your target feature has more than two potential values.
Your target feature has three potential values.
Therefore it is classified across multiple classes.
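The reasoning across options A through D can be sketched as a simple rule over the label values. This is only an illustration: `suggest_predictor_type` is a hypothetical helper, not part of any SageMaker API, and it assumes that string labels are class names and that numeric labels are continuous targets.

```python
from numbers import Real


def suggest_predictor_type(label_values):
    """Suggest a linear learner predictor_type from a sample of label values.

    Hypothetical helper for illustration only. Assumes string labels are
    class names and numeric labels are continuous targets, so integer-coded
    class labels would need separate handling.
    """
    distinct = set(label_values)
    if len(distinct) < 2:
        raise ValueError("need at least two distinct label values")
    # Categorical labels: choose by the number of distinct classes.
    if all(isinstance(v, str) for v in distinct):
        if len(distinct) == 2:
            return "binary_classifier"
        return "multiclass_classifier"
    # Numeric labels are treated here as a continuous regression target.
    if all(isinstance(v, Real) and not isinstance(v, bool) for v in distinct):
        return "regressor"
    raise ValueError("mixed or unsupported label types")


# Label values taken from the sample rows in the question.
sample_labels = ["not_fraud", "not_fraud", "undetermined", "fraud"]
print(suggest_predictor_type(sample_labels))  # multiclass_classifier
```

With only fraud and not_fraud in the label, the same rule would return binary_classifier, which is why the undetermined value matters here.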
Reference:
Please see the Amazon SageMaker developer guide titled Linear Learner Algorithm, the Amazon SageMaker Python SDK documentation on Read the Docs titled LinearLearner, and the Amazon Machine Learning developer guide titled Multiclass Classification.
Based on the given scenario, you are tasked with building a machine learning model to recognize fraudulent transactions in real-time. You have a dataset that contains features such as Transaction ID, Account ID, Type, Amount, Source, and Fraud. You have decided to use the linear learner built-in algorithm for your model.
The linear learner algorithm in Amazon SageMaker supports regression, binary classification, and multiclass classification. You select among them with the predictor_type hyperparameter (regressor, binary_classifier, or multiclass_classifier); for multiclass classification you also set num_classes.
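As a minimal sketch of what a multiclass configuration might look like for this dataset, the snippet below encodes the three label values as integers and builds a hyperparameter dictionary. The class ordering in `LABEL_TO_INDEX` and the helper names are assumptions made for illustration; `predictor_type` and `num_classes` are linear learner hyperparameters, and the algorithm expects class labels numbered from 0 to num_classes - 1.

```python
# Assumed mapping: the class order is an arbitrary choice for illustration.
LABEL_TO_INDEX = {"not_fraud": 0, "undetermined": 1, "fraud": 2}


def encode_labels(labels):
    """Map string labels to integer class indices 0..num_classes-1."""
    return [LABEL_TO_INDEX[label] for label in labels]


def build_hyperparameters(label_to_index):
    """Build a hyperparameter dict for a multiclass linear learner job.

    Hypothetical helper; only the two settings discussed in the text are set.
    """
    return {
        "predictor_type": "multiclass_classifier",
        "num_classes": len(label_to_index),
    }


# Label values taken from the sample rows in the question.
encoded = encode_labels(["not_fraud", "not_fraud", "undetermined", "fraud"])
hyperparameters = build_hyperparameters(LABEL_TO_INDEX)
print(encoded)          # [0, 0, 1, 2]
print(hyperparameters)
```

A dictionary like this could then be supplied as the training job's hyperparameters, for example through the LinearLearner estimator mentioned in the references; the remaining settings (instance type, data channels, and so on) are outside the scope of this sketch.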
Binary classification applies when the label has only two possible outcomes, for example fraud or not_fraud. A binary_classifier predicts the probability of the positive class, and the decision threshold can be set based on the cost of false negatives (fraudulent transactions that go undetected) versus the cost of false positives (legitimate transactions flagged as fraud). In this dataset, however, the fraud label takes three values (not_fraud, undetermined, and fraud), so a binary classifier does not fit the label as given.
Regressors are used for predicting continuous values, such as the amount of a transaction. Cross-entropy loss is a loss function used for optimizing classification models, but it is not a predictor type. Multiclass classifiers are used for problems where there are more than two possible outcomes, which is the case here.
In summary, the correct answer is D. multiclass_classifier.