Blue/Green Deployment Failure Recovery | AWS CloudFormation Stack Script

Recovering a Blue/Green Deployment Failure with AWS CloudFormation

Question

Your team has started maintaining a new application on AWS that uses blue/green deployment.

All of the application's AWS resources are deployed in a CloudFormation stack.

The application has an ELB and two Auto Scaling Groups (ASGs), one for the current release and one for the new release.

After the new ASG is attached to the ELB, tests run to verify that the new code works correctly.

However, these tests may fail.

You need to use a script to implement this blue/green deployment.

Which of the following options would you use to recover the system when the tests fail?

Answers

Explanations

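A. Use SNS to send a notification to your team so that it can react to the failure in time.

B. Collect logs and send them to AWS CloudWatch Logs.

C. Raise a CloudWatch alarm to alert the team when there is a failure, and create a CloudWatch dashboard based on the Auto Scaling Group metrics.

D. Roll back the system to the original state by detaching the new ASG from the ELB and attaching the old ASG to the ELB.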

Correct Answer - D.

Any script or code that manipulates AWS resources should be self-healing: if anything goes wrong, it should be able to return the system to its original state.

In other words, the script should be robust enough to roll the system back automatically when an exception occurs.
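The rollback-on-failure idea can be sketched in a few lines of Python. This is a minimal sketch, not the script the question refers to: `run_with_rollback` and its three callables (`apply_change`, `verify`, `rollback`) are hypothetical names introduced here for illustration.

```python
from typing import Callable


def run_with_rollback(
    apply_change: Callable[[], None],
    verify: Callable[[], bool],
    rollback: Callable[[], None],
) -> bool:
    """Apply a change, verify it, and undo it if anything goes wrong.

    Returns True if the change was kept, False if it was rolled back.
    All three callables are hypothetical hooks supplied by the caller.
    """
    try:
        apply_change()
        if verify():
            return True
    except Exception as exc:
        # An error while applying or verifying is treated like a failed test.
        print(f"deployment step failed: {exc!r}")
    # Reached when verification returned False or an exception was raised.
    rollback()
    return False
```

In the scenario above, `apply_change` would attach the new ASG to the ELB, `verify` would run the post-deployment tests, and `rollback` would restore the old ASG. Treating an exception the same way as a failed test means one code path handles both kinds of failure.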

Option A is incorrect: SNS is useful as a notification method.

However, it does not roll the system back, so it is not the most important thing to consider.

Option B is incorrect: Normally, the application should already generate troubleshooting logs.

In this question, it is not required to send logs to CloudWatch Logs.

Option C is incorrect for a similar reason to Option A.

This option can be regarded as a supplement.

Option D is CORRECT: After the new ASG is detached and the old ASG is attached, the system is back in its previous state.

This is the most important step to recover the system when a rollback is required.

The correct answer to this question is D: roll back the system to the original state by detaching the new ASG from the ELB and attaching the old ASG to the ELB.

Blue/green deployment is a release technique that aims to minimize downtime and interruption for end users. The new version of the software is deployed alongside the old version, and a load balancer shifts traffic from the old version to the new one, either all at once or gradually.

In this scenario, the application has an ELB and two Auto Scaling Groups, one for the current release and one for the new release. After the new ASG is attached to the ELB, tests run to verify that the new code works correctly. However, these tests may fail.

To recover the system when the tests fail, the best option is to roll back to the original state by detaching the new ASG from the ELB and attaching the old ASG to the ELB. This is the safest option because it limits the impact of the failed release on end users: the old ASG serves traffic until the issue is resolved.
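The swap itself might look like the following sketch. It assumes the ELB is a Classic Load Balancer and that `autoscaling` behaves like `boto3.client("autoscaling")`, whose `attach_load_balancers` and `detach_load_balancers` calls take an ASG name and a list of load balancer names. The function name `roll_back_to_old_asg` and the resource names in the usage comment are hypothetical.

```python
def roll_back_to_old_asg(autoscaling, elb_name: str, old_asg: str, new_asg: str) -> None:
    """Point the ELB back at the old Auto Scaling Group.

    `autoscaling` is assumed to behave like boto3.client("autoscaling").
    The client is passed in so the function can be exercised with a stub.
    """
    # Re-attach the old (known-good) ASG first so the ELB is not left
    # without any registered group while the swap is in progress.
    autoscaling.attach_load_balancers(
        AutoScalingGroupName=old_asg,
        LoadBalancerNames=[elb_name],
    )
    # Then remove the new ASG whose tests failed.
    autoscaling.detach_load_balancers(
        AutoScalingGroupName=new_asg,
        LoadBalancerNames=[elb_name],
    )


# Example usage (hypothetical resource names, requires boto3 and AWS credentials):
#   import boto3
#   roll_back_to_old_asg(boto3.client("autoscaling"), "my-elb", "blue-asg", "green-asg")
```

Attaching the old ASG before detaching the new one is a design choice of this sketch rather than something the question prescribes. The sketch also does not wait for the old instances to pass health checks, which a production script would likely need to do. For an Application or Network Load Balancer, the target-group variants `attach_load_balancer_target_groups` and `detach_load_balancer_target_groups` would be used instead.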

Option A, using SNS to send a notification to your team so that it can react to the failure in time, does not address the issue at hand. It notifies the team about the failure but does nothing to recover the system.

Option B, collecting logs and sending them to AWS CloudWatch Logs, is a good option for troubleshooting the issue but does not provide a solution to the problem.

Option C, raising a CloudWatch alarm to alert the team when there is a failure, and creating a CloudWatch dashboard based on the Auto Scaling Group metrics, is a good option for monitoring the system but does not provide a solution to the problem.

In conclusion, the best option to recover the system when the tests fail is to roll back to the original state by detaching the new ASG from the ELB and attaching the old ASG to the ELB.