You are a machine learning specialist at a pharmaceutical drug manufacturer.
Your team has the task of building a deep learning model to be used for drug discovery by combining data from various sources.
Your team will use the deep learning model to predict novel candidate biomolecules for new disease targets such as COVID-19. You and your team have created a deep learning neural network disease target prediction model that performs well on the training data.
However, it performs poorly on your test data. Which of the methods listed should your team use to correct your testing problem? (SELECT THREE)
A. Increase dropout
B. Decrease regularization
C. Increase regularization
D. Decrease dropout
E. Decrease feature combinations
F. Increase feature combinations
Answers: A, C and E.
Option A is CORRECT.
When your model performs well in training but poorly in testing, it usually means the model is overfitted and does not generalize well.
Increasing dropout, the probability that a given node is turned off during training, helps generalization.
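To make the symptom concrete, here is a self-contained toy sketch in Python (the data, model size, and library choice are illustrative assumptions, not part of the original question); a flexible network fit to pure-noise labels scores near-perfectly on its training split and near chance on the held-out split:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Pure-noise data: 200 samples, 100 features, random binary labels.
# Any pattern the model finds can only be memorized noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))
y = rng.integers(0, 2, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A wide hidden layer gives the network enough capacity to memorize.
clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=2000, random_state=0)
clf.fit(X_train, y_train)

print("train accuracy:", clf.score(X_train, y_train))  # close to 1.0
print("test accuracy:", clf.score(X_test, y_test))     # close to 0.5 (chance)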
Option B is incorrect.
Decreasing regularization is used when your model is underfitting; here the model is overfitting, so regularization should be increased, not decreased.
Option C is CORRECT.
You need to increase regularization to avoid overfitting, a common problem when your training data is “memorized” by your neural network.
Increasing regularization helps generalization.
Option D is incorrect.
Decreasing dropout will not help generalization.
It will have the opposite effect.
Option E is CORRECT.
Decreasing feature combinations, that is, reducing the number of features by combining multiple complementary features into fewer, more powerful ones, helps with generalization.
Option F is incorrect.
Increasing feature combinations adds model complexity and is used when your model is underfitting, not when it is overfitting.
Reference:
Please see the Amazon Machine Learning developer guide titled Model Fit: Underfitting vs. Overfitting.
Please refer to the Machine Learning Mastery article titled A Gentle Introduction to Dropout for Regularizing Deep Neural Networks.
When a deep learning model performs well on training data but poorly on test data, this indicates overfitting. Overfitting is a common problem in machine learning models where the model is too complex and learns the noise in the data instead of the underlying patterns. To correct the overfitting problem, the following methods can be used:
A. Increase dropout: Dropout is a regularization technique that randomly drops out some neurons in the neural network during training. This technique prevents overfitting by forcing the network to learn more robust features. By increasing the dropout rate, the network will drop more neurons during training, making the model more generalized and less prone to overfitting.
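A minimal sketch of this in Keras (the architecture, input width of 100, and the specific rate values are illustrative assumptions, not taken from the question):

import tensorflow as tf

# Each Dropout layer randomly zeroes the given fraction of activations
# during training, forcing the network to learn redundant, robust features.
# Raising the rate (e.g., from 0.2 to 0.5) drops more nodes per step,
# which typically reduces overfitting at the cost of slower convergence.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(100,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.5),  # increased from a lower rate such as 0.2
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")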
C. Increase regularization: Regularization is a technique used to reduce overfitting in machine learning models. It adds a penalty term to the loss function, which penalizes large weights in the model. By increasing the regularization parameter, the model will be penalized more for large weights, making the model less complex and less prone to overfitting.
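A similar hedged sketch using a Keras L2 kernel regularizer (the coefficient values and layer sizes are again illustrative assumptions):

import tensorflow as tf

# kernel_regularizer adds a penalty proportional to the sum of squared
# weights to the training loss; a larger coefficient penalizes large
# weights more strongly, pushing the model toward a simpler solution.
l2 = tf.keras.regularizers.l2(1e-2)  # increased from a weaker value such as 1e-4
model = tf.keras.Sequential([
    tf.keras.Input(shape=(100,)),
    tf.keras.layers.Dense(256, activation="relu", kernel_regularizer=l2),
    tf.keras.layers.Dense(128, activation="relu", kernel_regularizer=l2),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")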
E. Decrease feature combinations: Reducing the number of input features, or combining multiple complementary features into fewer, more powerful ones, lowers model complexity. A simpler model has less capacity to memorize noise in the training data and therefore generalizes better.
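As one possible illustration of combining features (the data shape and component count are arbitrary assumptions for demonstration), PCA from scikit-learn merges many correlated descriptors into a few composite features:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy stand-in for a feature matrix: 500 molecules x 100 raw descriptors.
X = np.random.rand(500, 100)

# PCA combines correlated raw features into a smaller set of composite
# components, shrinking the input dimensionality the network must fit.
reducer = make_pipeline(StandardScaler(), PCA(n_components=20))
X_reduced = reducer.fit_transform(X)
print(X_reduced.shape)  # (500, 20)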
On the other hand, the following methods can exacerbate the overfitting problem:
B. Decrease regularization: Decreasing the regularization parameter will reduce the penalty term in the loss function, allowing the model to have larger weights. This will make the model more complex and prone to overfitting.
D. Decrease dropout: Decreasing the dropout rate will drop fewer neurons during training, making the model less generalized and more prone to overfitting.
F. Increase feature combinations: Adding more feature combinations increases model complexity, giving the model more opportunity to memorize noise in the training data. This technique is appropriate when a model is underfitting, not overfitting.
In conclusion, to correct the overfitting problem, increasing dropout, increasing regularization, and decreasing feature combinations should be used, while decreasing regularization, decreasing dropout, and increasing feature combinations should be avoided.