While setting up your machine learning experiments, you need to ensure that the trained models will be appropriately scored with validation data.
Azure ML SDK provides several methods to specify in your scripts how to determine the data to be used for validation.
Which of the following are not valid sets of parameters for specifying the validation method? Select two.
Answers: C and E.
Option A is incorrect (the set is valid) because it supplies the primary metric and the training data, which must be given in all cases.
When only training data is provided, the default splitting rules will be applied.
Option B is incorrect (the set is valid) because it also supplies the required primary metric and training data.
When both training and validation data are provided, they are used as given, since no automatic splitting is needed.
Option C is CORRECT (the set is invalid) because it omits the training data; the primary metric and training data are always required.
Validation data can either be provided explicitly or be generated automatically from the training data (by default or explicit splitting rules).
Option D is incorrect because when only training data is provided, with validation set size set manually, this value will be used to split data into training and validation subsets.
Option E is CORRECT (the set is invalid) because it is redundant: the validation set is defined either by explicit validation data or by a validation set size applied to the training data, not by both.
Setting both validation data and validation set size is wrong.
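The rules in the explanations above can be sketched as a small checker. This is only an illustration of the logic stated here, not the Azure ML SDK's own validation: the function name `check_validation_settings`, its parameter names, and the placeholder values are hypothetical, and it encodes only the three rules given in this answer (primary metric required, training data required, validation data and validation set size mutually exclusive).

```python
from typing import Any, Dict, List, Optional


def check_validation_settings(
    primary_metric: Optional[str] = None,
    training_data: Optional[Any] = None,
    validation_data: Optional[Any] = None,
    validation_size: Optional[float] = None,
    n_cross_validations: Optional[int] = None,
) -> List[str]:
    """Return the rule violations for a parameter set (empty list = valid).

    Hypothetical helper: it mirrors the rules in the explanation above,
    not the checks performed by any real SDK.
    """
    problems: List[str] = []
    if primary_metric is None:
        problems.append("primary metric is required")
    if training_data is None:
        problems.append("training data is required")
    if validation_data is not None and validation_size is not None:
        problems.append(
            "validation data and validation set size are mutually exclusive"
        )
    # n_cross_validations is accepted but not checked: the explanation
    # states no separate rule for it.
    return problems


# The five answer options, with placeholder strings standing in for datasets.
options: Dict[str, Dict[str, Any]] = {
    "A": dict(primary_metric="accuracy", training_data="train"),
    "B": dict(primary_metric="accuracy", training_data="train",
              validation_data="valid"),
    "C": dict(primary_metric="accuracy", validation_data="valid",
              n_cross_validations=5),
    "D": dict(primary_metric="accuracy", training_data="train",
              validation_size=0.2),
    "E": dict(primary_metric="accuracy", training_data="train",
              validation_data="valid", validation_size=0.2),
}

invalid = sorted(
    name for name, kwargs in options.items()
    if check_validation_settings(**kwargs)
)
print(invalid)  # ['C', 'E']
```

Under these assumptions, C fails because it has no training data and E fails because it sets both validation data and a validation set size, which matches the stated answer.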
When setting up machine learning experiments, it is essential to validate the trained models to ensure that they perform well on unseen data. Azure ML SDK offers several methods to specify how to determine the data used for validation.
The candidate sets of parameters for specifying the validation method are:
A. Primary metric; training data: This set specifies the primary metric to optimize during model training and the data used for training the model.
B. Primary metric; training data; validation data: This set specifies the primary metric to optimize during model training, the data used for training the model, and the data used for validating the model.
C. Primary metric; validation data; number of cross-validations: This set specifies the primary metric to optimize during model training, the data used for validating the model, and the number of cross-validation folds.
D. Primary metric; training data; validation set size: This set specifies the primary metric to optimize during model training, the data used for training the model, and the size of the validation set.
E. Primary metric; training data; validation data; validation set size: This set specifies the primary metric to optimize during model training, the data used for training the model, the data used for validating the model, and the size of the validation set.
Option E is not a valid set of parameters: validation data and validation set size are two alternative ways of defining the validation set, so specifying both is contradictory.
Option C is not valid either: it includes the primary metric and the validation data but no training data, which is required to train the model. Option A, by contrast, is valid: when only the primary metric and the training data are given, the default splitting rules produce the validation data.
Therefore, options C and E are the two invalid sets of parameters for specifying the validation method.