You are the new IT architect in a company that operates a mobile sleep tracking application.
When activated at night, the mobile app sends 100 KB of collected data points to your backend every 5 minutes.
The backend takes care of authenticating the user and writing the data points into an Amazon DynamoDB table.
Every morning, you scan the table to extract and aggregate last night's data on a per-user basis and store the results in Amazon S3.
Users are notified via Amazon SNS mobile push notifications that new data is available, which is parsed and visualized by the mobile app.
Currently, you have around 100k users who are mostly based out of North America.
You have been tasked to optimize the architecture of the backend system to lower costs.
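A. Have the mobile app access Amazon DynamoDB directly instead of JSON files stored on Amazon S3.
B. Replace both Amazon DynamoDB and Amazon S3 with an Amazon Redshift cluster.
C. Introduce an Amazon SQS queue to buffer writes to the Amazon DynamoDB table and reduce provisioned write throughput.
D. Introduce Amazon ElastiCache to cache reads from the Amazon DynamoDB table and reduce provisioned read throughput.
E. Create a new Amazon DynamoDB table each day and drop the one for the previous day after its data is on Amazon S3.
Answers: C and E.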
Option A is incorrect because having 100k users read from and write to the DynamoDB table directly would require much higher provisioned read and write capacity, which would increase the cost drastically.
Option B is incorrect because running an Amazon Redshift cluster in place of DynamoDB and S3 would be a very expensive solution in this scenario.
Option C is CORRECT because (a) with SQS, the huge number of writes overnight will be buffered/queued which will avoid exhausting the write capacity (hence, cutting down on cost), and (b) SQS can handle a sudden high load, if any.
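The capacity argument for option C can be made concrete with a rough calculation. The sketch below assumes standard DynamoDB writes, where one write capacity unit (WCU) covers one write per second for an item of up to 1 KB, and it assumes that all 100k users are tracking at the same time. The 60-second burst window and the function name `required_wcu` are hypothetical illustrations, not figures or names from the scenario.

```python
def required_wcu(users: int, item_kb: int, window_s: int) -> int:
    """WCUs needed if every user writes one item within window_s seconds."""
    wcu_per_write = item_kb  # assumption: standard write, 1 WCU per 1 KB, whole KB
    total = users * wcu_per_write
    return -(-total // window_s)  # ceiling division without floats


USERS = 100_000
ITEM_KB = 100
INTERVAL_S = 5 * 60  # the app reports every 5 minutes

# Writes spread evenly over the 5-minute interval, e.g. drained from a queue.
smoothed = required_wcu(USERS, ITEM_KB, INTERVAL_S)

# Hypothetical burst: all apps report within the same 60 seconds.
burst = required_wcu(USERS, ITEM_KB, 60)

print(smoothed, burst)  # prints: 33334 166667
```

Under these assumptions, provisioning for the burst would need about five times the write capacity of the smoothed case, which is the gap a queue is meant to close.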
Option D is incorrect because users do not read the data directly from the DynamoDB table.
They read the aggregated results from S3, so caching DynamoDB reads with ElastiCache is unnecessary.
Option E is CORRECT because once the aggregated data is stored on S3, there is no point in keeping the DynamoDB tables pertaining to the previous days.
Keeping the tables for the latest data only will certainly cut the unnecessary costs, keeping the overall cost of the solution down.
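The daily rotation in option E can be sketched as two small helpers: one that derives a table name from a date, and one that selects the older tables that are safe to drop. The naming scheme `sleep_data_YYYY_MM_DD`, the helper names, and the `exported` set (tables whose aggregates are confirmed to be on S3) are all assumptions made for illustration. The actual table creation and deletion calls are left out.

```python
from datetime import date, timedelta

TABLE_PREFIX = "sleep_data_"  # hypothetical naming scheme


def table_name_for(day: date) -> str:
    """Name of the table that holds the data points for one day."""
    return f"{TABLE_PREFIX}{day:%Y_%m_%d}"


def tables_to_drop(existing: list[str], today: date, exported: set[str]) -> list[str]:
    """Tables older than today's whose data is confirmed to be on S3."""
    current = table_name_for(today)
    # Zero-padded dates sort lexicographically in chronological order.
    return sorted(
        name for name in existing
        if name.startswith(TABLE_PREFIX) and name < current and name in exported
    )


today = date(2024, 1, 15)
existing = [table_name_for(today - timedelta(days=n)) for n in range(3)]
exported = {table_name_for(date(2024, 1, 13))}  # the Jan 14 export is not confirmed yet

print(tables_to_drop(existing, today, exported))  # prints: ['sleep_data_2024_01_13']
```

Dropping a table only after its export is confirmed keeps the cost saving without risking the loss of data that has not yet been aggregated.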
A key step in lowering the cost of the backend system is to introduce an Amazon SQS queue to buffer writes to the Amazon DynamoDB table, which allows the provisioned write throughput to be reduced.
Option A, having the mobile app access Amazon DynamoDB directly instead of JSON files stored on Amazon S3, would not be cost-effective, as it would require a high provisioned throughput for writes and reads to DynamoDB, which could become expensive as the number of users grows.
Option B, replacing both Amazon DynamoDB and Amazon S3 with an Amazon Redshift cluster, is not a good fit because Redshift is designed for analytical workloads rather than the frequent small transactional writes this backend performs.
Option D, introducing Amazon ElastiCache to cache reads from the Amazon DynamoDB table and reduce provisioned read throughput, would only help with read throughput, not write throughput, and reads are not the main load here because users fetch their results from S3.
Option E, creating a new Amazon DynamoDB table each day and dropping the one for the previous day after its data is on Amazon S3, adds some operational complexity, and a table should only be dropped once its export to S3 has been verified, to avoid data loss. With that safeguard in place, it removes the cost of keeping raw data that is no longer needed, which is why it is also listed as a correct answer.
Therefore, option C is the core optimization: introducing an Amazon SQS queue to buffer writes lets you provision lower write throughput on the DynamoDB table, because the queue absorbs writes during peak times and a consumer applies them at a steady rate. It also absorbs traffic spikes, so write capacity is less likely to be exceeded.
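The buffering effect can be illustrated with a small in-memory simulation. This is a sketch of the idea only: it uses a Python `deque` as a stand-in for the queue and does not call SQS or DynamoDB. The function name `drain_through_queue` and the tick-based model are assumptions made for the example.

```python
from collections import deque


def drain_through_queue(arrivals: list[int], max_writes_per_tick: int) -> tuple[list[int], int]:
    """Simulate buffered writes; returns (writes_per_tick, peak_backlog)."""
    if max_writes_per_tick <= 0:
        raise ValueError("max_writes_per_tick must be positive")
    queue = deque()
    writes_per_tick = []
    peak_backlog = 0
    tick = 0
    # Keep ticking until all arrivals are seen and the queue is empty.
    while tick < len(arrivals) or queue:
        if tick < len(arrivals):
            queue.extend([tick] * arrivals[tick])  # enqueue this tick's messages
        peak_backlog = max(peak_backlog, len(queue))
        written = min(max_writes_per_tick, len(queue))
        for _ in range(written):
            queue.popleft()  # stands in for one write to the table
        writes_per_tick.append(written)
        tick += 1
    return writes_per_tick, peak_backlog


# A burst of 8 messages, then a quiet tick, then 1 more message.
writes, backlog = drain_through_queue([8, 0, 1], max_writes_per_tick=2)
print(writes, backlog)  # prints: [2, 2, 2, 2, 1] 8
```

The table never sees more than two writes per tick even though eight messages arrive at once. The trade-off is latency: data reaches the table later than it was sent, so the morning aggregation has to run after the queue has drained.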