Google Cloud DevOps: Ensuring Service Level Objectives (SLOs) in Production


Question

You are part of an organization that follows SRE practices and principles.

You are taking over the management of a new service from the Development Team, and you conduct a Production Readiness Review (PRR).

After the PRR analysis phase, you determine that the service cannot currently meet its Service Level Objectives (SLOs).

You want to ensure that the service can meet its SLOs in production.

What should you do next?

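Answers

A. Adjust the SLO targets to be achievable by the service so you can bring it into production.
B. Notify the development team that they will have to provide production support for the service.
C. Identify recommended reliability improvements to the service to be completed before handover.
D. Bring the service into production with no SLOs and build them when you have collected operational data.

Explanations

Correct answer: C.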

An SRE's ultimate goal is to ensure that a service meets the Service Level Objectives (SLOs) the organization has agreed on. If a service cannot currently meet its SLOs, the next step is therefore to act to bring it up to the required level of reliability.

Option A, adjusting the SLO targets to be achievable by the service so you can bring it into production, is not recommended: it lowers the target to fit the service's current behavior and so compromises the quality of service and reliability that the organization has committed to providing.
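To see what relaxing a target gives up, the downtime that an availability SLO permits can be computed directly. The sketch below is illustrative only: the 99.9% and 99% targets and the 30-day window are assumed values, not taken from the question, and `allowed_downtime_minutes` is a hypothetical helper.

```python
def allowed_downtime_minutes(slo_target, window_days=30):
    """Minutes of full downtime an availability SLO permits over the window."""
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


# Assumed example targets; compare the original target with a relaxed one.
for target in (0.999, 0.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target:.1%} over 30 days allows {minutes:.1f} minutes of downtime")
```

Under these assumptions, relaxing the target from 99.9% to 99% raises the permitted downtime from about 43 minutes to about 432 minutes per 30 days.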

Option B, notifying the development team that they will have to provide production support for the service, is also not the best solution: it shifts additional workload and responsibility onto the development team without making the service any more reliable, so the SLOs would still be missed.

Option D, bringing the service into production with no SLOs and building them when you have collected operational data, is not the recommended approach either: without SLOs there is no reliability target to hold the service to, so nothing guarantees that it will meet the desired level of reliability.

The best option is C, identifying recommended reliability improvements to the service to be completed before handover. This involves working with the development team to identify the root causes of the service's inability to meet its SLOs and proposing concrete measures to improve reliability. The improvements may involve changes to the service's architecture, infrastructure, monitoring, or testing practices. Once the improvements have been implemented and tested, the service can be handed over with confidence that it will meet its SLOs in production.
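The gap analysis behind option C can be sketched as a simple readiness check that compares measured service level indicators (SLIs) against their SLO targets. This is a minimal illustration, not a Google Cloud API: the SLO names, targets, and measured values are hypothetical, and `Slo` and `find_slo_gaps` are helpers invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """A hypothetical SLO: the SLI must be at or above `target` (a ratio in [0, 1])."""
    name: str
    target: float


def find_slo_gaps(slos, measured):
    """Return {slo_name: shortfall} for every SLO that the measured SLI misses.

    `measured` maps an SLO name to the SLI observed during the PRR analysis.
    An SLO with no measurement is treated as observed 0.0, so it is always flagged.
    """
    gaps = {}
    for slo in slos:
        observed = measured.get(slo.name, 0.0)
        if observed < slo.target:
            gaps[slo.name] = round(slo.target - observed, 6)
    return gaps


# Hypothetical targets and measurements, for illustration only.
slos = [Slo("availability", 0.999), Slo("latency_under_300ms", 0.95)]
measured = {"availability": 0.9975, "latency_under_300ms": 0.97}

gaps = find_slo_gaps(slos, measured)
ready_for_handover = not gaps  # hand over only when no SLO is missed
print(gaps, ready_for_handover)
```

With these assumed numbers, only availability is flagged, so the service is not ready for handover and the availability shortfall is where the recommended improvements would focus.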

In summary, an SRE must prioritize the reliability of the service. If the service cannot meet its SLOs, work with the development team to identify and implement the improvements needed to reach the desired level of reliability.