Building ML Pipelines for Image Classification with Azure: A DP-100 Exam Guide

Designing ML Pipelines for Image Classification using Azure

Question

Your team is tasked with building an ML solution to classify and label a large number of images.

The images are stored as files in Azure Blob storage, and the model to be used is a pre-trained, ready-to-use neural network.

You decide to use Azure ML pipelines, which are powerful tools for building automated workflows.

You start on the high-level design of a pipeline that fits your goal.

The main building blocks you can choose from:

  1. Register blob_storage as a Datastore

  2. Register blob_storage as a Dataset

  3. Register the model to your Workspace

  4. Register the model to your Pipeline

  5. Create the Pipeline

  6. Register image files as a Datastore

  7. Register image files as Dataset

  8. Create Pipeline steps

Which of the above blocks should you use?

Answers

Explanations

A. B. C. D.

Answer: C.

Option A is incorrect because the source container of data (the blob storage) must be registered as a Datastore, hence “Dataset” is incorrect in this context.

In addition, the model has to be registered to the ML Workspace (not to the Pipeline).

Option B is incorrect because the source container of data (the blob storage) must be registered as a Datastore, hence “Dataset” is incorrect in this context.

Option C is CORRECT because, to build the required ML pipeline, you need to register the source storage as a Datastore; register the image files as a Dataset; register your pre-trained model to your ML Workspace; define the necessary Steps for the pipeline; and, finally, build the Pipeline object.
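
The five steps named for option C can be sketched with the Azure ML Python SDK v1 (the azureml-core and azureml-pipeline packages). This is a minimal sketch rather than a reference implementation: the datastore name "image_store", the container name "images", the dataset name "images_to_label", the model name "pretrained_classifier", the script "score.py" in "./scripts", and the function name itself are all hypothetical, and the sketch assumes an existing Workspace object, a storage account key, and a compute target are supplied by the caller.

```python
def build_classification_pipeline(ws, account_name, account_key, model_path, compute_name):
    """Assemble (but do not submit) a pipeline that classifies images from blob storage."""
    # Imported inside the function so the SDK is only needed when the function is called.
    from azureml.core import Dataset, Datastore
    from azureml.core.model import Model
    from azureml.pipeline.core import Pipeline
    from azureml.pipeline.steps import PythonScriptStep

    # 1. Register the blob container (the storage location) as a Datastore.
    blob_store = Datastore.register_azure_blob_container(
        workspace=ws,
        datastore_name="image_store",   # hypothetical name
        container_name="images",        # hypothetical container
        account_name=account_name,
        account_key=account_key,
    )

    # 2. Register the image files inside that Datastore as a file Dataset.
    images = Dataset.File.from_files(path=(blob_store, "**/*.jpg"))
    images = images.register(workspace=ws, name="images_to_label",
                             create_new_version=True)

    # 3. Register the pre-trained model to the Workspace (not to the Pipeline).
    Model.register(workspace=ws, model_path=model_path,
                   model_name="pretrained_classifier")

    # 4. Define the pipeline step; score.py is a hypothetical scoring script
    #    that loads the registered model and labels the mounted images.
    classify_step = PythonScriptStep(
        name="classify_images",
        script_name="score.py",
        source_directory="./scripts",
        arguments=["--model-name", "pretrained_classifier",
                   "--images", images.as_named_input("images").as_mount()],
        compute_target=compute_name,
    )

    # 5. Build the Pipeline object from the steps.
    return Pipeline(workspace=ws, steps=[classify_step])
```

The order mirrors the dependencies: the Dataset needs the Datastore, the step needs the Dataset and a registered model name, and the Pipeline needs the steps.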

Option D is incorrect because the image files to be processed will serve as a Dataset for the pipeline, therefore trying to use them as a “Datastore” is incorrect.

To build an ML solution that classifies and labels a large number of images stored in Azure Blob storage with a pre-trained, ready-to-use neural network, we can use Azure ML pipelines. The main building blocks that we can choose from are:

  1. Register blob_storage as a Datastore: This block is required. A Datastore holds the connection information for the blob storage, so the Workspace can reach the container where the images are stored.

  2. Register blob_storage as a Dataset: This block is not used. The blob storage is a storage container, not data, so it is registered as a Datastore; a Dataset references the files inside it.

  3. Register the model to your Workspace: This block is required to register the pre-trained, ready-to-use neural network model to our Azure ML Workspace.

  4. Register the model to your Pipeline: This block is not used. Models are registered to the Azure ML Workspace, not to a Pipeline; pipeline steps then load the registered model.

  5. Create the Pipeline: This block is required to assemble the steps into the ML pipeline that runs the pre-trained model to classify the images.

  6. Register image files as a Datastore: This block is not used. The image files are the data to be processed, so they are registered as a Dataset; the Datastore is the blob storage that holds them.

  7. Register image files as Dataset: This block is required to create a Dataset from the images in the blob storage, which the pipeline passes to the pre-trained model for classification.

  8. Create Pipeline steps: This block is required to define the steps in the ML pipeline, such as data preprocessing and image classification with the pre-trained model.

Therefore, the correct answer is option C. We need to use the following blocks: Register blob_storage as a Datastore, Register image files as a Dataset, Register the model to your Workspace, Create Pipeline steps, and Create the Pipeline.
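
The reasoning behind the selection can be written down as a small lookup: each kind of resource has exactly one valid registration target, and the two "create" blocks are always needed. The sketch below is only an illustration of that rule; the names VALID_TARGET, BLOCKS, and select_blocks are hypothetical and not part of any Azure API.

```python
# Each kind of resource has exactly one valid registration target.
VALID_TARGET = {
    "blob_storage": "Datastore",  # a storage location
    "image files": "Dataset",     # the data inside that location
    "model": "Workspace",         # models are registered to the Workspace
}

# (block number, resource, target); None marks the two "create" blocks.
BLOCKS = [
    (1, "blob_storage", "Datastore"),
    (2, "blob_storage", "Dataset"),
    (3, "model", "Workspace"),
    (4, "model", "Pipeline"),
    (5, None, None),              # Create the Pipeline
    (6, "image files", "Datastore"),
    (7, "image files", "Dataset"),
    (8, None, None),              # Create Pipeline steps
]


def select_blocks(blocks=BLOCKS):
    """Return the numbers of the blocks that belong in the design."""
    return [
        number
        for number, resource, target in blocks
        if resource is None or VALID_TARGET[resource] == target
    ]


print(select_blocks())  # [1, 3, 5, 7, 8]
```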