Configuring ML Environment and Building ML Pipeline for Data Science Solution on Azure | DP-100 Exam Preparation | Microsoft

ML Studio: Automatic Setup of ML Resources | DP-100 Exam Answer Explanation

Question

You are using Azure ML Studio to configure your machine learning environment.

You need to ingest data from a web location and build an ML pipeline that prepares the data and trains your model.

Before you can run your first experiment, every ML resource must be set up.

Which of the following actions do you not need to complete manually?

Answers

A. B. C. D.

Answer: D.

Option A is incorrect because a compute instance is a fully configured workstation within your ML workspace.

It is used to run notebooks and can also be used for smaller training workloads.

You need to create one manually.

Option B is incorrect because creating a machine learning workspace is a mandatory requirement for any machine learning experiment, pipeline, etc.

This is the very first step in configuring an ML work environment.

Option C is incorrect because a training cluster is needed for running the training experiments.

It must be configured manually before running the pipeline.
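The manual step for the training cluster can also be scripted. The following is a minimal sketch, assuming the v1 Python SDK (azureml-core) and an existing workspace object `ws`; the cluster name, VM size, and node count are illustrative placeholders, not values from the question.

```python
def cluster_settings(vm_size="STANDARD_DS3_V2", max_nodes=2):
    """Build the keyword arguments for the cluster's provisioning configuration.

    The VM size and node count are hypothetical defaults chosen for illustration.
    """
    if max_nodes < 1:
        raise ValueError("max_nodes must be at least 1")
    # min_nodes=0 lets the cluster scale down to zero nodes when it is idle.
    return {"vm_size": vm_size, "min_nodes": 0, "max_nodes": max_nodes}


def get_or_create_cluster(ws, name="cpu-cluster", **kwargs):
    """Return the compute target called `name`, creating the cluster if it is missing.

    Assumes azureml-core (SDK v1) is installed; `ws` is an azureml.core.Workspace.
    """
    # Imported lazily so the helper above can be used without the SDK installed.
    from azureml.core.compute import AmlCompute, ComputeTarget
    from azureml.core.compute_target import ComputeTargetException

    try:
        # Reuse the cluster if somebody has already created it.
        return ComputeTarget(workspace=ws, name=name)
    except ComputeTargetException:
        config = AmlCompute.provisioning_configuration(**cluster_settings(**kwargs))
        cluster = ComputeTarget.create(ws, name, config)
        cluster.wait_for_completion(show_output=True)
        return cluster
```

Either way, through the studio UI or through code, somebody has to define the cluster before the pipeline can be submitted to it.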

Option D is CORRECT because Azure Blob storage is created automatically as an associated resource when the ML workspace is created.

You don't need to create one manually.

If your data resides in your company's data stores (for example, in Data Lake storage), you need to add them to the workspace as datastores, but the workspace has a blob store by default.
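One way to confirm that the blob store is already there is to ask the workspace for its default datastore. This is a minimal sketch, assuming the v1 Python SDK (azureml-core) and a `config.json` downloaded from the workspace; the function names are hypothetical helpers written for this example.

```python
def load_workspace():
    """Connect to an existing workspace.

    Assumes azureml-core (SDK v1) is installed and a config.json for the
    workspace is in the working directory.
    """
    from azureml.core import Workspace  # imported lazily

    return Workspace.from_config()


def describe_default_datastore(ws):
    """Return the name and type of the datastore the workspace uses by default.

    Only get_default_datastore() is called on `ws`, so nothing is created here.
    """
    datastore = ws.get_default_datastore()
    return {"name": datastore.name, "type": datastore.datastore_type}
```

Calling `describe_default_datastore(load_workspace())` on a newly created workspace typically reports the datastore named `workspaceblobstore`, without any datastore having been registered by hand.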

Reference:

To configure a machine learning environment in Azure ML Studio, you need several resources in place before you can prepare your data and train your model. The four resources mentioned in the question are:

A. Compute instance: A virtual machine used for interactive development and experimentation.

B. Workspace: A centralized location where you can manage your machine learning assets such as models, datasets, and pipelines.

C. Training cluster: A set of compute resources used to train a machine learning model at scale.

D. Datastore: A storage location for your data, where you can access and manage your data from within your workspace.

To ingest data from a web location, you do not need to create a new datastore. You can create a dataset directly from the file's URL (for example, with the "From web files" option when creating a dataset in the studio), and anything the workspace needs to store can go to the default blob datastore that was created with the workspace. An additional datastore has to be registered only when the data lives in another storage service, such as a Data Lake.

Once the data is available as a dataset, you can use it in your ML pipeline for preparing the data and training your model. You can build an ML pipeline using the drag-and-drop designer in Azure ML Studio, selecting the appropriate modules for data preparation, training, and evaluation.
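Ingesting a file from a web location can be sketched in code as well. The example below assumes the v1 Python SDK (azureml-core), a delimited text file such as a CSV, and a workspace object `ws`; the helper names and the dataset name are hypothetical, and any URL passed in is supplied by the caller.

```python
from urllib.parse import urlparse


def is_web_location(path):
    """Return True if `path` is an http(s) URL rather than a local or datastore path."""
    parsed = urlparse(path)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def register_web_dataset(ws, url, name):
    """Create a tabular dataset from a delimited file at `url` and register it.

    Assumes azureml-core (SDK v1) is installed; `ws` is an azureml.core.Workspace.
    """
    if not is_web_location(url):
        raise ValueError(f"not a web location: {url!r}")

    from azureml.core import Dataset  # imported lazily

    # The dataset points straight at the URL; no extra datastore is registered.
    dataset = Dataset.Tabular.from_delimited_files(path=url)
    return dataset.register(workspace=ws, name=name, create_new_version=True)
```

The registered dataset can then be used as the input of the data preparation step in the pipeline.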

Before you can run your experiment, you need a workspace, a compute instance, and a training cluster. The workspace comes first, because every other resource lives inside it; it is typically created from the Azure portal. Inside the studio, you can then create a compute instance and a training cluster from the "Compute" page in the left-hand menu, choosing the VM size and, for a cluster, the number of nodes.

The only resource that you do not need to create manually is the datastore. Azure Blob storage is provisioned as an associated resource when the workspace is created and is registered as the workspace's default datastore. The training cluster, by contrast, is not created on demand: a pipeline run needs an existing compute target, so the cluster must be configured before the first run.

In summary, the actions that you need to complete manually before running your experiment are creating a workspace, a compute instance, and a training cluster. The default blob datastore is created automatically with the workspace, which is why the answer is D.