You work for an oil refinery company where you are on one of their machine learning teams.
Your team is responsible for building models that help the company decide where to place their exploratory drilling teams worldwide.
Your team lead has decided to build your model based on the K-Means built-in SageMaker algorithm.
The team lead has tasked you with providing metric visualization charts for the training runs of your team's model. How would you go about visualizing the training metrics? (Select TWO)
Answers: C, F.
Option A is incorrect.
You use the SageMaker Python module sagemaker.analytics (not pandas.analytics), from which you import TrainingJobAnalytics (not TrainingAnalytics), to gain access to the Python methods that let you visualize your metrics in charts.
Option B is incorrect.
You use the SageMaker Python module sagemaker.analytics, from which you import TrainingJobAnalytics (not TrainingAnalytics), to gain access to the Python methods that let you visualize your metrics in charts.
Option C is correct.
You use the SageMaker Python module sagemaker.analytics, from which you import TrainingJobAnalytics, to gain access to the Python methods that let you visualize your metrics in charts.
Option D is incorrect.
You use the SageMaker Python module sagemaker.analytics (not pandas.analytics), from which you import TrainingJobAnalytics, to gain access to the Python methods that let you visualize your metrics in charts.
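As a quick illustration of why options A, B, and D fail, here is a minimal sketch of the import behind the correct answer; the commented-out lines show the module and class names from the incorrect options, which do not exist:
```python
# Correct: the TrainingJobAnalytics class lives in the sagemaker.analytics module
from sagemaker.analytics import TrainingJobAnalytics

# The imports below fail, because neither the pandas.analytics module
# nor a TrainingAnalytics class exists:
# from pandas.analytics import TrainingJobAnalytics   # no such module in pandas
# from sagemaker.analytics import TrainingAnalytics   # no such class in sagemaker
```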
Option E is incorrect.
To set the metric name that you wish to visualize, you must specify a metric that is valid for the algorithm you are training.
The test:cross_entropy metric is not valid for a K-Means training run.
Option F is correct.
To set the metric name that you wish to visualize, you must specify a metric that is valid for the algorithm you are training.
The test:msd metric is one of the two valid metrics for a K-Means training run.
The other valid K-Means metric is test:ssd.
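As a sanity check, here is a minimal sketch (with '<training_job_name>' as a placeholder for a completed K-Means training job) that lists the metric names SageMaker recorded for the job; for K-Means these should be limited to test:msd and test:ssd:
```python
from sagemaker.analytics import TrainingJobAnalytics

# '<training_job_name>' is a placeholder for a completed K-Means training job
metrics_df = TrainingJobAnalytics(training_job_name='<training_job_name>').dataframe()

# The DataFrame has the columns timestamp, metric_name, and value;
# for K-Means the recorded metric names should be test:msd and test:ssd
print(metrics_df['metric_name'].unique())
```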
Reference:
Please see the AWS Machine Learning Blog titled Easily monitor and visualize metrics while training models on Amazon SageMaker, and the Amazon SageMaker developer guide titled Tune a K-Means model.
To visualize the training metrics for the K-Means built-in SageMaker algorithm, combine the two correct answer options as follows:
Option 1 (answer C): Import the TrainingJobAnalytics class from the sagemaker.analytics module, then use it to retrieve the training metrics from the training job as a pandas DataFrame.
Code example:
```python
from sagemaker.analytics import TrainingJobAnalytics
import matplotlib.pyplot as plt

# Name of the completed K-Means training job
training_job_name = '<training_job_name>'

# Create a TrainingJobAnalytics object for the given training job
tja = TrainingJobAnalytics(training_job_name=training_job_name)

# Get the training metrics as a pandas DataFrame
# (columns: timestamp, metric_name, value)
metrics_df = tja.dataframe()

# Visualize the test:msd metric over time
msd = metrics_df[metrics_df['metric_name'] == 'test:msd']
plt.plot(msd['timestamp'], msd['value'], label='test:msd')
plt.xlabel('timestamp')
plt.ylabel('test:msd')
plt.legend()
plt.show()
```
Option 2 (answer F): Set the metric name to one that is valid for the K-Means algorithm. The only valid K-Means metrics are test:msd and test:ssd; metrics such as test:cross_entropy are not emitted by a K-Means training job. There is no pandas.analytics module and no TrainingAnalytics class, so the approaches described in answer options A, B, and D cannot work.
Code example:
```python
from sagemaker.analytics import TrainingJobAnalytics
import matplotlib.pyplot as plt

# Name of the completed K-Means training job
training_job_name = '<training_job_name>'

# Request only the metrics that K-Means actually emits
tja = TrainingJobAnalytics(
    training_job_name=training_job_name,
    metric_names=['test:msd', 'test:ssd'],
)
metrics_df = tja.dataframe()

# Plot each valid K-Means metric as its own series
for metric_name in ['test:msd', 'test:ssd']:
    series = metrics_df[metrics_df['metric_name'] == metric_name]
    plt.plot(series['timestamp'], series['value'], label=metric_name)

plt.xlabel('timestamp')
plt.legend()
plt.show()
```
Note that in both code examples the metric names are valid for the K-Means algorithm (test:msd and test:ssd). Answer option E is incorrect because test:cross_entropy is not a metric emitted by a K-Means training run, while answer option F is correct because test:msd is one of the two valid K-Means metrics.