Database Failover Risk with New System

Question

Your product is currently deployed in three Google Cloud Platform (GCP) zones with your users divided between the zones.

You can fail over from one zone to another, but it causes a 10-minute service disruption for the affected users.

You typically experience a database failure once per quarter and can detect it within five minutes.

You are cataloging the reliability risks of a new real-time chat feature for your product.

You catalog the following information for each risk:

  • Mean Time to Detect (MTTD) in minutes
  • Mean Time to Repair (MTTR) in minutes
  • Mean Time Between Failure (MTBF) in days
  • User Impact Percentage

The chat feature requires a new database system that takes twice as long to successfully fail over between zones.

You want to account for the risk of the new database failing in one zone.

What would be the values for the risk of database failover with the new system?

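Answers

A. MTTD: 5 MTTR: 10 MTBF: 90 Impact: 33%

B. MTTD: 5 MTTR: 20 MTBF: 90 Impact: 33%

C. MTTD: 5 MTTR: 10 MTBF: 90 Impact: 50%

D. MTTD: 5 MTTR: 20 MTBF: 90 Impact: 50%

Correct answer: B.

Explanation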

To calculate the values for the risk of database failover with the new system, we need to consider the following factors:

  1. Mean Time to Detect (MTTD): This is the amount of time it takes to detect that a failure has occurred. In this scenario, we are given that the database failure can be detected within five minutes. Therefore, the MTTD would be 5 minutes.

  2. Mean Time to Repair (MTTR): This is the amount of time it takes to restore service once the failure has been detected. We are told that failing over between zones currently causes a 10-minute disruption and that the new database system takes twice as long to fail over. Therefore, the MTTR would be 20 minutes (2 x 10 minutes).

  3. Mean Time Between Failure (MTBF): This is the average amount of time between failures of the system. We are given that a database failure occurs once per quarter, and a quarter is roughly 90 days. Therefore, the MTBF would be 90 days.

  4. User Impact Percentage: This is the percentage of users who will be affected by the failure. We are told that the product is deployed in three zones with users divided between them, and the risk being cataloged is the database failing in one zone. Only the users in that zone are affected, which is about one third of all users. Therefore, the user impact percentage would be 33%.
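The four factors above can be derived mechanically from the scenario. The following is a minimal sketch, not part of the original question: the variable and function names are hypothetical, and it assumes users are split evenly across the zones and that a quarter is counted as 90 days.

```python
# Scenario inputs taken from the question text.
zones = 3                      # users are divided between three zones
detect_minutes = 5             # a failure is detected within five minutes
current_failover_minutes = 10  # disruption caused by a failover today
failover_slowdown = 2          # the new database takes twice as long to fail over
failures_per_year = 4          # one database failure per quarter


def catalog_entry():
    """Return the four catalog values for a single-zone database failure."""
    return {
        "mttd_minutes": detect_minutes,
        "mttr_minutes": current_failover_minutes * failover_slowdown,
        # Assumption: a 360-day year, so that one quarter is exactly 90 days.
        "mtbf_days": 360 // failures_per_year,
        # Assumption: users are split evenly, so one zone holds 1/3 of them.
        "impact_percent": round(100 / zones),
    }


risk = catalog_entry()
print(risk)
```

Only the MTTR changes relative to the current system; detection time, failure frequency, and the share of users per zone are properties of the deployment, not of the new database.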

Based on the above factors, we can evaluate each option for the risk of database failover with the new system:

Option A: MTTD: 5 MTTR: 10 MTBF: 90 Impact: 33%

  • This option assumes that the MTTR remains the same as the current system, even though we are told that failover with the new database system takes twice as long. Therefore, this option is not correct.

Option B: MTTD: 5 MTTR: 20 MTBF: 90 Impact: 33%

  • This option correctly accounts for the longer failover time of the new database system. Therefore, this option is correct.

Option C: MTTD: 5 MTTR: 10 MTBF: 90 Impact: 50%

  • This option keeps the MTTR at 10 minutes, ignoring the doubled failover time, and also assumes a 50% user impact even though users are divided between three zones. Therefore, this option is not correct.

Option D: MTTD: 5 MTTR: 20 MTBF: 90 Impact: 50%

  • This option uses the correct MTTR of 20 minutes, but it assumes a 50% user impact. With users divided between three zones, a failure in one zone affects about 33% of users. Therefore, this option is not correct.
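The option-by-option comparison can also be written as a small check. This is an illustrative sketch only: the dictionary keys and the `mismatches` helper are hypothetical names, and the expected values are the ones derived in the four factors above.

```python
# Values derived from the scenario: 5 min to detect, 2 x 10 min to fail over,
# one failure per quarter (about 90 days), one of three zones affected.
expected = {"mttd": 5, "mttr": 20, "mtbf": 90, "impact": 33}

# The four answer options as listed in this explanation.
options = {
    "A": {"mttd": 5, "mttr": 10, "mtbf": 90, "impact": 33},
    "B": {"mttd": 5, "mttr": 20, "mtbf": 90, "impact": 33},
    "C": {"mttd": 5, "mttr": 10, "mtbf": 90, "impact": 50},
    "D": {"mttd": 5, "mttr": 20, "mtbf": 90, "impact": 50},
}


def mismatches(option):
    """Return the sorted names of the fields that differ from the expected values."""
    return sorted(key for key in expected if option[key] != expected[key])


for label, values in options.items():
    wrong = mismatches(values)
    verdict = "matches" if not wrong else "differs on " + ", ".join(wrong)
    print(f"Option {label}: {verdict}")
```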

Therefore, the correct answer is Option B: MTTD: 5 MTTR: 20 MTBF: 90 Impact: 33%.
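To see why the doubled failover time matters, a catalog entry like this can be turned into an estimate of expected "bad minutes" per year. The question itself does not ask for this, and the formula below is an assumption borrowed from a common risk-analysis approach: (MTTD + MTTR) multiplied by incidents per year and by the fraction of users affected. The function name is hypothetical.

```python
def bad_minutes_per_year(mttd_min, mttr_min, mtbf_days, impact_percent):
    """Estimate user-weighted outage minutes per year for one risk.

    Assumed formula: (MTTD + MTTR) * incidents per year * impact fraction.
    """
    incidents_per_year = 365 / mtbf_days
    outage_minutes = mttd_min + mttr_min
    return outage_minutes * incidents_per_year * impact_percent / 100


# Current database: 10-minute failover. New database: 20-minute failover.
current = bad_minutes_per_year(5, 10, 90, 33)
new = bad_minutes_per_year(5, 20, 90, 33)
print(f"current system: {current:.1f} bad minutes per year")
print(f"new system:     {new:.1f} bad minutes per year")
```

Under these assumptions each incident grows from 15 to 25 minutes of disruption for the affected users, so the yearly estimate for this risk scales by the same ratio.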