AWS DevOps: Simplifying and Accelerating Operations Debugging


Question

You are hired as the new head of operations for a SaaS company.

Your CTO has asked you to make debugging any part of your entire operation simpler and as fast as possible.

She complains that she has no idea what is going on in the complex, service-oriented architecture because the developers log to disk.

It's tough to find errors in logs on so many services.

How can you best meet this requirement and satisfy your CTO?

Answers

A. Copy all log files into AWS S3 using a cron job on each instance. Use an S3 notification configuration on the PutBucket event and publish events to AWS Lambda. Use the Lambda to analyze logs as soon as they come in and flag issues.

B. Begin using CloudWatch Logs on every service. Stream all Log Groups into S3 objects. Use AWS EMR cluster jobs to perform ad hoc MapReduce analysis and write new queries when needed.

C. Copy all log files into AWS S3 using a cron job on each instance. Use an S3 notification configuration on the PutBucket event and publish events to AWS Kinesis. Use Apache Spark on AWS EMR to perform at-scale stream processing queries on the log chunks and flag issues.

D. Begin using CloudWatch Logs on every service. Stream all Log Groups into an AWS Elasticsearch Service Domain running Kibana 4 and perform log analysis on a search cluster.

Explanations

Answer - D.

Amazon Elasticsearch Service makes it easy to deploy, operate, and scale Elasticsearch for log analytics, full-text search, application monitoring, and more.

Amazon Elasticsearch Service is a fully managed service that delivers Elasticsearch's easy-to-use APIs and real-time capabilities along with the availability, scalability, and security required by production workloads.

The service offers built-in integrations with Kibana, Logstash, and AWS services, including Amazon Kinesis Firehose, AWS Lambda, and Amazon CloudWatch so that you can go from raw data to actionable insights quickly.
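As a rough illustration of how little setup the managed service needs, the boto3 sketch below provisions a small domain. The domain name, Elasticsearch version, instance type and count, and volume size are placeholder assumptions, not values from the question.

```python
import boto3

es = boto3.client("es")  # Amazon Elasticsearch Service API

# Minimal sketch: provision a small managed Elasticsearch domain.
# Domain name, version, instance type/count, and volume size are
# illustrative assumptions only.
response = es.create_elasticsearch_domain(
    DomainName="saas-log-analytics",
    ElasticsearchVersion="7.10",
    ElasticsearchClusterConfig={
        "InstanceType": "r5.large.elasticsearch",
        "InstanceCount": 2,
    },
    EBSOptions={
        "EBSEnabled": True,
        "VolumeType": "gp2",
        "VolumeSize": 20,
    },
)
print(response["DomainStatus"]["ARN"])
```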

For more information on Amazon Elasticsearch Service, please refer to the link below:

https://aws.amazon.com/elasticsearch-service/

The requirement is to simplify and speed up debugging across a complex, service-oriented architecture in which many services log to disk, making errors hard to find. The best way to meet this requirement and satisfy the CTO is a centralized logging solution that collects, analyzes, and flags issues across all services.

Option A: This option copies all log files into AWS S3 using a cron job on each instance. An S3 notification configuration on the PutBucket event publishes events to AWS Lambda, which analyzes the logs as soon as they arrive and flags issues. This can work for small-scale logging, but per-instance cron jobs add delay and operational overhead, and the approach may not scale to environments with a high volume of logs.
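For context, the Lambda side of this option might look roughly like the sketch below: a function triggered by the S3 object-created notification that scans each newly uploaded log object for error markers. The marker list and the use of a plain log line as the "flag" are illustrative assumptions, not part of the question.

```python
import json
import boto3

s3 = boto3.client("s3")

# Illustrative assumption: which substrings count as "issues" to flag.
ERROR_MARKERS = ("ERROR", "FATAL", "Traceback")

def handler(event, context):
    """Triggered by the S3 object-created notification for new log uploads."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        text = body.decode("utf-8", errors="replace")
        flagged = [line for line in text.splitlines()
                   if any(marker in line for marker in ERROR_MARKERS)]
        if flagged:
            # "Flagging" here just logs to CloudWatch; a real setup might
            # publish to SNS or open a ticket instead.
            print(json.dumps({"bucket": bucket, "key": key,
                              "error_lines": flagged[:20]}))
```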

Option B: This option enables CloudWatch Logs on every service and streams all Log Groups into S3 objects. Ad hoc MapReduce analysis is then performed with AWS EMR cluster jobs, and new queries are written as needed. This approach scales to a large volume of logs, but it requires significant MapReduce expertise, and batch cluster jobs are not well suited to real-time monitoring.
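As a rough sketch of the CloudWatch-Logs-to-S3 leg of this option, the boto3 call below runs a batch export of one log group's last 24 hours into a bucket for later EMR analysis. The log group, bucket, and prefix names are hypothetical, and a continuous pipeline would stream rather than batch-export.

```python
import time
import boto3

logs = boto3.client("logs")

now_ms = int(time.time() * 1000)
one_day_ms = 24 * 60 * 60 * 1000

# Batch-export the last 24 hours of one service's log group to S3,
# where EMR MapReduce jobs can query the objects later. All names
# below are hypothetical placeholders.
logs.create_export_task(
    taskName="orders-service-daily-export",
    logGroupName="/saas/orders-service",
    fromTime=now_ms - one_day_ms,
    to=now_ms,
    destination="central-logs-bucket",
    destinationPrefix="exports/orders-service",
)
```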

Option C: This option copies all log files into AWS S3 using a cron job on each instance. An S3 notification configuration on the PutBucket event publishes events to AWS Kinesis, a scalable, fully managed service for real-time processing of streaming data, and Apache Spark on AWS EMR performs at-scale stream processing queries on the log chunks and flags issues. This handles large-scale logging with near-real-time analysis, but it still depends on cron jobs for ingestion and requires you to build and operate the Spark processing layer yourself.
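A minimal sketch of the S3-to-Kinesis hand-off, assuming (since S3 notifications cannot target Kinesis streams directly) a small Lambda relay that forwards a pointer to each new log object into a hypothetical log-chunks stream for the Spark job to consume:

```python
import json
import boto3

kinesis = boto3.client("kinesis")

def handler(event, context):
    """Relays S3 object-created notifications into a Kinesis stream so the
    Spark-on-EMR job can pick up new log chunks for stream processing."""
    for record in event["Records"]:
        pointer = {
            "bucket": record["s3"]["bucket"]["name"],
            "key": record["s3"]["object"]["key"],
        }
        kinesis.put_record(
            StreamName="log-chunks",  # hypothetical stream name
            Data=json.dumps(pointer).encode("utf-8"),
            PartitionKey=pointer["key"],
        )
```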

Option D: This option enables CloudWatch Logs on every service and streams all Log Groups into an Amazon Elasticsearch Service domain running Kibana 4, with log analysis performed on the search cluster. Because both CloudWatch Logs and Amazon Elasticsearch Service are fully managed, logs from every service are centralized and searchable in near real time without building a custom pipeline; the main learning curve is the Elasticsearch query language and Kibana itself.
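To make the mechanics concrete, a subscription filter streams each service's log group out of CloudWatch Logs. The hedged sketch below forwards every log event to the Lambda function that the CloudWatch Logs-to-Elasticsearch integration typically uses; the log group name and function ARN are placeholders.

```python
import boto3

logs = boto3.client("logs")

# Stream every event from one service's log group to the (placeholder)
# Lambda function that forwards events into the Elasticsearch domain.
# The function must already grant CloudWatch Logs permission to invoke it.
logs.put_subscription_filter(
    logGroupName="/saas/orders-service",
    filterName="stream-to-elasticsearch",
    filterPattern="",  # an empty pattern matches every log event
    destinationArn=(
        "arn:aws:lambda:us-east-1:123456789012:"
        "function:LogsToElasticsearch"
    ),
)
```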

Based on the requirements and the pros and cons of each option, Option D is the most suitable solution for the SaaS company. Streaming CloudWatch Logs from every service into an Amazon Elasticsearch Service domain centralizes all logs in one searchable cluster, and Kibana provides near-real-time search and visualization out of the box. Because both services are fully managed, this option simplifies and speeds up debugging without the cron jobs, custom pipelines, or cluster jobs that Options A, B, and C require.