You are a machine learning specialist working for a large retail clothing company.
Your marketing department wants to leverage machine learning to understand product loyalty.
Your machine learning team has decided to group the customers into categories based on which customers may churn, meaning abandon a given product for another, maybe a competitor's product, within the next 6 months.
You have labeled data for customer product loyalty for the previous two years available to you for training your model. Which machine learning model type should you and your team use to build your customer loyalty model?
Click on the arrows to vote for the correct answer
A. B. C. D.Answer: C.
Option A is incorrect.
Because you have labeled data and you want to group your customers into categories, classification is a better choice than regression.
Option B is incorrect.
Clustering is done using unlabeled data.
You have labeled data.
So you will get better results using a classification algorithm.
Option C is CORRECT.
Since your data is labeled, you will get the best results using a classification algorithm when attempting to classify your customers.
Option D is incorrect.
Reinforcement algorithms are used for solving problems such as supply chain management, HVAC systems, industrial robotics, game artificial intelligence, dialog systems, and autonomous vehicles.
They attempt to learn a strategy.
They are not well suited to grouping data elements into categories.
Reference:
Please see the Amazon SageMaker developer guide titled Use Amazon SageMaker Built-in Algorithms.
Please refer to the article titled Regression vs Classification in Machine Learning: What is The Difference?.
Please review the article titled How to Use Unlabeled Data in Machine Learning.
Please refer to the Amazon SageMaker developer guide titled Use reinforcement learning with Amazon SageMaker.
The appropriate machine learning model for predicting customer churn is a classification model. Therefore, option C is the correct answer.
Classification models are used when the target variable is categorical. In this case, the target variable is customer loyalty, which can either be loyal or churned. The classification model will learn from the labeled data for the previous two years to predict which customers are likely to churn in the next six months.
SageMaker is a managed service provided by Amazon Web Services (AWS) that allows developers and data scientists to build, train, and deploy machine learning models at scale. The Linear Learner and XGBoost are built-in SageMaker algorithms that can be used for classification tasks.
Linear regression, which is a type of regression analysis, is used when the target variable is continuous. It is not suitable for this problem since the target variable, customer loyalty, is categorical.
Clustering, on the other hand, is a type of unsupervised learning used to group similar data points into clusters. It is not a suitable model for this problem since the goal is to predict customer churn, which is a labeled target variable, rather than grouping similar data points.
Finally, reinforcement learning is a type of machine learning used in scenarios where the machine learns by interacting with an environment. It is not a suitable model for this problem since the goal is to predict customer churn, which is a labeled target variable, rather than learning by interacting with an environment.
Therefore, option C - Classification using either the Linear Learner or XGBoost built-in SageMaker algorithms is the best choice for building a customer loyalty model.