You are a machine learning specialist working for a farming corporation.
Your team is working on a machine learning problem where you are trying to determine the price of a given square plot of corn.
You have many features in your data set that you can use in your model, for example, the length of the plot in meters, corn type, corn height, and the longitude and latitude of the plot. You are trying to understand how you can use the length-of-plot (in meters) feature in your model.
You have chosen to use a Linear Learner algorithm.
When you fit the model to the length-of-plot feature you get results that are not optimal.
The plot of length against price shows that the relationship is not linear. How can you transform the length-of-plot feature to make it usable in your Linear Learner-based model?
A. B. C. D.
Correct Answer: C.
Option A is incorrect.
Converting the length-of-plot to a different unit of measure, from meters to feet, will not change its relationship to the price.
You'll still have a non-linear relationship.
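As a minimal sketch of why a unit change does not help, the snippet below computes the Pearson correlation between length and price in meters and again in feet. The data is hypothetical: it assumes square plots whose price is exactly proportional to area, with an invented price per square meter. Rescaling a feature by a constant leaves the correlation unchanged, so the curve keeps the same shape.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: square plots priced in proportion to their area.
PRICE_PER_SQUARE_METER = 2.0  # assumed value, for illustration only
lengths_m = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
prices = [PRICE_PER_SQUARE_METER * length ** 2 for length in lengths_m]

# Same feature expressed in feet: a constant rescaling of every value.
FEET_PER_METER = 3.28084
lengths_ft = [length * FEET_PER_METER for length in lengths_m]

r_meters = pearson(lengths_m, prices)
r_feet = pearson(lengths_ft, prices)
print(f"correlation with price (meters): {r_meters:.6f}")
print(f"correlation with price (feet):   {r_feet:.6f}")
```

Both correlations come out the same, and both fall short of 1, which is the non-linearity the question describes.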
Option B is incorrect.
The Mutual Information (MI) score is a feature metric that you can use to measure the association of a feature to your target.
It is used to determine which features are most relevant to your model based on their MI scores (higher is better).
This technique will tell you which features carry promise for making your model perform better, but it won't help you transform your length-of-plot feature.
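To illustrate what an MI score does and does not do, here is a rough sketch that estimates MI from an equal-width histogram. This is a simplified estimator written for illustration, not the method used by any particular library, and the data, the bin count, and the `plot_id_mod_4` feature are all hypothetical. It scores the length feature well above an unrelated feature, yet it leaves the length values themselves untouched.

```python
import math
from collections import Counter


def bin_index(value, low, high, n_bins):
    """Map a value to an equal-width bin index in [0, n_bins - 1]."""
    if high == low:
        return 0
    idx = int((value - low) / (high - low) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value falls in the last bin


def binned_mutual_information(xs, ys, n_bins=4):
    """Rough MI estimate (in nats) from a 2-D equal-width histogram."""
    n = len(xs)
    x_bins = [bin_index(x, min(xs), max(xs), n_bins) for x in xs]
    y_bins = [bin_index(y, min(ys), max(ys), n_bins) for y in ys]
    joint = Counter(zip(x_bins, y_bins))
    count_x = Counter(x_bins)
    count_y = Counter(y_bins)
    mi = 0.0
    for (bx, by), count in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (count / n) * math.log(count * n / (count_x[bx] * count_y[by]))
    return mi


# Hypothetical data: 40 square plots priced in proportion to their area.
lengths_m = [10.0 + i for i in range(40)]
prices = [2.0 * length ** 2 for length in lengths_m]  # assumed price rule
plot_id_mod_4 = [float(i % 4) for i in range(40)]     # unrelated to price

mi_length = binned_mutual_information(lengths_m, prices)
mi_unrelated = binned_mutual_information(plot_id_mod_4, prices)
print(f"MI(length, price):        {mi_length:.3f}")
print(f"MI(plot_id_mod_4, price): {mi_unrelated:.3f}")
```

The score ranks length as informative even though its relationship to price is curved, so MI confirms the feature is worth keeping but still leaves the transformation step to you.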
Option C is correct.
Because each plot is square, squaring the length-of-plot gives you an area feature, and area has a much more linear relationship to price than length does.
The Linear Learner algorithm works best when your features have a linear relationship to the target.
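The effect of squaring can be sketched with a plain least-squares line fit. This is not the SageMaker Linear Learner itself, only a stand-in straight-line model, and the data assumes a hypothetical price rule in which price is exactly proportional to area. The sketch compares the worst-case error of a line fitted on length with one fitted on area.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def max_abs_residual(xs, ys):
    """Largest absolute error of the best-fit line over the data."""
    slope, intercept = fit_line(xs, ys)
    return max(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))


# Hypothetical data: square plots priced in proportion to their area.
PRICE_PER_SQUARE_METER = 2.0  # assumed value, for illustration only
lengths_m = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
prices = [PRICE_PER_SQUARE_METER * length ** 2 for length in lengths_m]

# The transformation from option C: square the length to get the area.
areas_m2 = [length ** 2 for length in lengths_m]

residual_length = max_abs_residual(lengths_m, prices)
residual_area = max_abs_residual(areas_m2, prices)
print(f"worst error, line fitted on length: {residual_length:.2f}")
print(f"worst error, line fitted on area:   {residual_area:.2f}")
```

On this idealized data the line fitted on area is essentially exact, while the line fitted on length misses by a wide margin. Real prices would be noisier, but the area feature should still be the better input to a linear model.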
Option D is incorrect.
A feature generated by a length-of-plot-to-corn-height transformation is unlikely to have a linear relationship to the price target.
Reference:
Please see the Kaggle article titled What Is Feature Engineering (https://www.kaggle.com/ryanholbrook/what-is-feature-engineering), the Amazon SageMaker developer guide titled Linear Learner Algorithm (https://docs.aws.amazon.com/sagemaker/latest/dg/linear-learner.html), and the Towards Data Science article titled Select Features for Machine Learning Model with Mutual Information (https://towardsdatascience.com/select-features-for-machine-learning-model-with-mutual-information-534fe387d5c8).