For your machine learning experiments, you are going to use the Scikit-Learn framework.
You want to keep your Python code defining the run configuration as simple and compact as possible.
Which is the best way to achieve this goal?
Answer: B.
Option A is incorrect because, although this approach can set the run configuration, the pre-configured SKLearn estimator is the simpler choice for the Scikit-Learn framework.
Option B is CORRECT because the simplest way to define the run configuration for the learning script built on a given ML framework (like Scikit-Learn) is to use the framework-specific estimators.
Option C is incorrect because, although this approach can also set the run configuration, the pre-configured SKLearn estimator is the simpler choice for the Scikit-Learn framework.
Option D is incorrect because framework-specific ML packages (such as Scikit-Learn or PyTorch) are not included in the base configuration.
If you need Scikit-Learn, you have to add it to your run configuration (either via ScriptRunConfig or via an estimator).
To keep the Python code that defines the run configuration for Scikit-Learn experiments as simple and compact as possible, we have the following options:
A. Use CondaDependencies.create(conda_packages=['scikit-learn']...) to define the environment and use it as the environment_definition parameter of an Estimator:
This option creates a Conda environment with Scikit-Learn installed and uses it as the environment for running the experiment. This can be achieved using the CondaDependencies class from the azureml.core.conda_dependencies module. We can use this class to create a new environment with specific packages installed, such as Scikit-Learn, and then pass it to the Estimator class as the environment_definition parameter. This ensures that the environment used to run the experiment has Scikit-Learn installed, and we don't need to worry about installing it separately.
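Option A could be sketched roughly as follows, assuming the Azure ML Python SDK v1 (azureml-core and azureml-train-core) is installed and still ships the generic Estimator class. The function name, the environment name "sklearn-env", and the three arguments are hypothetical placeholders, not part of the original question.

```python
# Packages the run environment must provide (taken from the option text).
CONDA_PACKAGES = ["scikit-learn"]


def make_estimator_with_environment(source_directory, entry_script, compute_target):
    """Option A sketch: build an explicit environment, then hand it to Estimator."""
    # Imported lazily so this module still loads where the SDK is not installed.
    from azureml.core import Environment
    from azureml.core.conda_dependencies import CondaDependencies
    from azureml.train.estimator import Estimator

    # "sklearn-env" is an illustrative name, not one defined by the source.
    env = Environment(name="sklearn-env")
    env.python.conda_dependencies = CondaDependencies.create(
        conda_packages=CONDA_PACKAGES
    )

    return Estimator(
        source_directory=source_directory,
        entry_script=entry_script,
        compute_target=compute_target,
        environment_definition=env,
    )
```

Compared with the other options, this variant needs two extra imports and a separate environment object, which is why it is workable but not the most compact.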
B. Import the SKLearn package and use the SKLearn pre-configured estimator to define the run configuration:
This option involves using the pre-configured estimator for Scikit-Learn provided by the azureml.train.sklearn module. This estimator automatically sets up the environment with Scikit-Learn installed and takes care of other necessary configurations. We can simply import this estimator and use it to define the run configuration.
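A minimal sketch of option B, assuming an Azure ML SDK v1 release that still includes azureml.train.sklearn. The function name and its arguments (a script folder, a training script such as a hypothetical train.py, and a compute target) are illustrative assumptions.

```python
def make_sklearn_estimator(source_directory, entry_script, compute_target):
    """Option B sketch: the pre-configured SKLearn estimator needs no package list."""
    # Imported lazily so this module still loads where the SDK is not installed.
    from azureml.train.sklearn import SKLearn

    # No conda_packages or environment_definition argument is passed here:
    # the estimator is expected to supply Scikit-Learn itself.
    return SKLearn(
        source_directory=source_directory,
        entry_script=entry_script,
        compute_target=compute_target,
    )
```

The returned object would typically be passed to an experiment's submit method. Note that nothing in the call mentions Scikit-Learn as a dependency, which is what makes this the most compact of the options.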
C. Import the Estimator package and use Estimator with parameter conda_packages=['scikit-learn']:
This option is similar to option A, but instead of using the CondaDependencies class to create the environment, we can simply pass the conda_packages parameter to the Estimator class with a list of packages to be installed in the environment. In this case, we would pass ['scikit-learn'] as the value for the conda_packages parameter.
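Option C might look like the sketch below, again assuming an Azure ML SDK v1 release that still provides the generic Estimator class. The function name and its three arguments are hypothetical.

```python
def make_generic_estimator(source_directory, entry_script, compute_target):
    """Option C sketch: generic Estimator with the package list passed inline."""
    # Imported lazily so this module still loads where the SDK is not installed.
    from azureml.train.estimator import Estimator

    return Estimator(
        source_directory=source_directory,
        entry_script=entry_script,
        compute_target=compute_target,
        # The dependency has to be named explicitly for the generic estimator.
        conda_packages=["scikit-learn"],
    )
```

This is shorter than building a separate environment object, but the caller still has to remember to list Scikit-Learn by hand.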
D. You don't need to set anything special because the Azure ML environments are pre-configured for the Scikit-Learn framework.
This option suggests that we don't need to set up anything special as the Azure ML environments are already pre-configured with Scikit-Learn. However, this option is not correct as the Azure ML environments do not come pre-configured with every package and library, and we need to specify the necessary dependencies.
In summary, the best option to achieve the goal of keeping the Python code defining the run configuration for Scikit-Learn experiments as simple and compact as possible is option B, which involves using the pre-configured estimator provided by the azureml.train.sklearn module. However, options A and C are also valid alternatives that offer more control over the environment configuration. Option D is incorrect as it assumes the environments are already pre-configured with Scikit-Learn, which is not the case.