You work as a machine learning specialist for a sports gambling website.
Your machine learning team has been asked to create a football score prediction model that predicts the winner of a match, the score difference, and the shots-on-goal differential.
You have collected historical football match data, and you have selected the SageMaker XGBoost built-in algorithm to use for your model.
You are now ready to train your model using a SageMaker training job.
Which of the following are NOT used by your SageMaker training job? (Select TWO)
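A. URL of the S3 bucket where you have stored your training data
B. SageMaker managed ML compute instances that SageMaker will use for model training
C. ECS path where the training code is stored
D. URL of the S3 bucket where you have stored your training code
E. ECR path where the training code is stored
Correct Answers: C and D.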
Option A is incorrect.
You must supply the URL of the S3 bucket where you have stored your training data to run your training job.
Option B is incorrect.
You must supply the ML compute instances that SageMaker will use for model training.
Option C is correct.
A training job references its training code by an ECR path, not an ECS path.
Option D is correct.
A training job references its training code by an ECR path, not the URL of an S3 bucket.
Option E is incorrect.
You must supply the ECR path where the training code is stored for your training job to use.
References:
The Amazon SageMaker developer guide titled Train a Model with Amazon SageMaker (https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-training.html)
The Amazon SageMaker developer guide titled Docker Registry Paths for SageMaker Built-in Algorithms (https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-algo-docker-registry-paths.html)
The SageMaker XGBoost algorithm is a built-in algorithm provided by Amazon SageMaker that is designed to handle regression and classification problems. It is particularly suitable for large-scale, distributed training jobs and can be used with structured data.
To train the model, a SageMaker training job will be created. During this process, Amazon SageMaker will handle many of the underlying infrastructure and orchestration tasks, such as creating and managing the compute instances used for training and retrieving data from an S3 bucket.
However, not every option listed in the question is an input to the training job. Let's go through each option and determine which ones the SageMaker training job does not use:
A. URL of the S3 bucket where you have stored your training data. This option is used by the SageMaker training job because the training data is read from an S3 bucket during training.
B. SageMaker managed ML compute instances that SageMaker will use for model training. This option is used by the SageMaker training job because Amazon SageMaker creates and manages the compute instances used for training.
C. ECS path where the training code is stored. This option is not used by the SageMaker training job. ECS stands for Elastic Container Service, which is a container management service provided by AWS. A SageMaker training job does not read its training code from an ECS path; the training code is referenced by an ECR path instead.
D. URL of the S3 bucket where you have stored your training code. This option is not used by the SageMaker training job. The training code is referenced by an ECR path, not by the URL of an S3 bucket.
E. ECR path where the training code is stored. This option is used by the SageMaker training job. ECR stands for Elastic Container Registry, which is a container registry service provided by AWS. You must supply the ECR path where the training code is stored so that the training job can use it.
Therefore, the options that are not used by the SageMaker training job are C and D.
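The inputs discussed above can be sketched as a request body for the SageMaker CreateTrainingJob API. This is a minimal illustration, not a definitive setup: the helper name build_training_job_request and every concrete value (job name, account ID, image name, bucket names, role ARN, instance type) are hypothetical placeholders. The output path, IAM role, and stopping condition are not part of the question but are included because the API expects them. Note that the request has a field for the ECR image path and for the S3 training data, and no field for an ECS path or an S3 URL of training code.

```python
# Sketch of a CreateTrainingJob request body for a built-in algorithm.
# All concrete values passed in below are hypothetical placeholders.

def build_training_job_request(job_name, image_uri, train_s3_uri,
                               output_s3_uri, role_arn,
                               instance_type="ml.m5.xlarge",
                               instance_count=1):
    """Return a dict shaped like a CreateTrainingJob request."""
    return {
        "TrainingJobName": job_name,
        # Option E: ECR path of the container image holding the training code.
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        # Option A: S3 location of the training data.
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
                "ContentType": "text/csv",
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Option B: ML compute instances that SageMaker manages for training.
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


request = build_training_job_request(
    job_name="football-xgboost-example",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example-xgboost:latest",
    train_s3_uri="s3://example-bucket/football/train/",
    output_s3_uri="s3://example-bucket/football/output/",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
)
```

With valid AWS credentials and real values in place of the placeholders, a dictionary of this shape could be passed to boto3 as boto3.client("sagemaker").create_training_job(**request). The actual registry path for the built-in XGBoost image depends on the region and version, and is listed in the Docker Registry Paths guide.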