You are the new IT architect in a company that operates a mobile sleep tracking application.
When activated at night, the mobile app sends collected data points of 1 KB every 5 minutes to your middleware.
The middleware layer takes care of authenticating the user and writing the data points into an Amazon DynamoDB table.
Every morning, you scan the table to extract and aggregate last night's data on a per-user basis and store the results in Amazon S3.
Users are notified via Amazon SNS mobile push notifications that new data is available, which is then parsed and visualized by the mobile app.
The old data is not required by the end-users.
Currently, you have around 100k users.
You have been tasked to optimize the architecture of the middleware system to lower the cost.
What would you recommend?
(Select TWO)
A. B. C. D. E.
Answer - A and C.
Option A is CORRECT because (a) the old data is obsolete once it has been aggregated and does not need to be kept, which lowers the storage cost, and (b) storing data in DynamoDB is expensive.
Hence, you can set an expiry time (TTL) on each item so that the data gets deleted automatically.
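As a minimal sketch of what Option A involves, the snippet below builds a data point that carries a TTL attribute and shows how TTL could be switched on with boto3. The attribute names (`user_id`, `recorded_at`, `payload`, `expires_at`), the two-day retention period, and the helper names are assumptions made for illustration; they do not come from the question. DynamoDB expects the TTL attribute to be a Number holding a Unix epoch time in seconds, and it removes expired items in the background some time after that moment rather than exactly at it.

```python
import time
from typing import Optional

SECONDS_PER_DAY = 24 * 60 * 60


def build_data_point(user_id: str, payload: str,
                     now: Optional[int] = None,
                     retention_days: int = 2) -> dict:
    """Build an item in DynamoDB's low-level attribute-value format.

    All attribute names and the retention period are hypothetical.
    """
    now = int(time.time()) if now is None else now
    return {
        "user_id": {"S": user_id},
        "recorded_at": {"N": str(now)},
        "payload": {"S": payload},
        # TTL attribute: epoch seconds after which the item may be deleted.
        "expires_at": {"N": str(now + retention_days * SECONDS_PER_DAY)},
    }


def enable_ttl(client, table_name: str) -> None:
    """Enable TTL on the table; `client` is assumed to be boto3.client("dynamodb")."""
    client.update_time_to_live(
        TableName=table_name,
        TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
    )


item = build_data_point("user-123", "sleep-sample", now=1_700_000_000)
print(item["expires_at"])
```

The middleware would pass such an item to `put_item`; `enable_ttl` only needs to be called once per table.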
Option B is incorrect because (a) storing the data in DynamoDB is more expensive than storing it in S3, and (b) giving the app access to DynamoDB to read the data adds operational overhead.
Option C is CORRECT because (a) it uses SQS, which lets you reduce the provisioned write throughput and so cuts costs, and (b) the queue acts as a buffer that absorbs sudden higher load, so writes do not exceed the provisioned capacity.
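To see why write throughput dominates the bill, a rough back-of-the-envelope calculation helps. The sketch below assumes the worst case in which all 100k users are tracking at the same time, and relies on the rule that one standard write of an item up to 1 KB consumes one write capacity unit (WCU). The function names are illustrative only.

```python
import math

USERS = 100_000            # from the scenario
WRITE_INTERVAL_S = 5 * 60  # one data point every 5 minutes
ITEM_SIZE_KB = 1           # from the scenario


def wcu_per_write(item_size_kb: float) -> int:
    """WCUs consumed by one standard write (1 WCU per 1 KB, rounded up)."""
    return max(1, math.ceil(item_size_kb))


def average_wcu(active_users: int, interval_s: int, item_size_kb: float) -> float:
    """Average write capacity needed if writes were spread evenly over time."""
    return active_users / interval_s * wcu_per_write(item_size_kb)


avg = average_wcu(USERS, WRITE_INTERVAL_S, ITEM_SIZE_KB)
print(f"average write load: about {avg:.0f} WCU")
```

Without a buffer, the table has to be provisioned for the peak write rate, which can sit well above this average when many devices report at once. With a queue in front of the table, provisioning close to the average can be enough.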
Option D is incorrect because the data is only read once before it is stored to S3.
The cache would only be useful if you read things multiple times.
Also, in this scenario optimizing "write" operations is most desired, not "read" ones.
Option E is incorrect because (a) an Amazon Redshift cluster is primarily used for OLAP workloads, not OLTP, so it is not suitable for this scenario, and (b) moving the storage to a Redshift cluster means paying for cluster nodes that run continuously, which is not a cost-effective solution.
For complete guidelines on working with DynamoDB, please visit the URLs below:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForTables.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/howitworks-ttl.html
The goal is to optimize the middleware system of a mobile sleep tracking application. The mobile app sends data points of 1 KB every 5 minutes to an Amazon DynamoDB table via a middleware layer; the data is then aggregated on a per-user basis, and the results are stored in Amazon S3 for visualization and notification purposes.
To lower the cost, the following two recommendations can be made:
Store the data in the DynamoDB table with a Time to Live (TTL) so that the data is deleted automatically (Option A). The TTL feature lets DynamoDB remove old items from the table without any action from the middleware. Since end-users do not need the old data, nothing is lost by expiring it. TTL deletions do not consume write throughput, so there is no additional cost for deleting old data, and the cost of storing the data is minimized.
Introduce an Amazon SQS queue to buffer writes to the Amazon DynamoDB table and reduce provisioned write throughput (Option C). The queue absorbs spikes in data ingestion, so the table can be provisioned for a steady write rate instead of the peak. Additionally, because messages stay in the queue until a consumer has processed them, the middleware layer can retry writes and process every data point reliably, even under high concurrency or after a transient failure.
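The buffering effect described for Option C can be illustrated with a small simulation. This is only a sketch under made-up numbers: the arrival pattern (a 10-second burst of 1,000 writes per second followed by a quiet period) and the drain rate of 400 writes per second are hypothetical, and `simulate_buffer` is an illustrative helper, not part of any AWS SDK.

```python
from typing import Iterable, Tuple


def simulate_buffer(arrivals_per_second: Iterable[int],
                    drain_rate: int) -> Tuple[int, int]:
    """Simulate a queue drained at a fixed rate.

    Returns (largest backlog observed, backlog left at the end).
    """
    backlog = 0
    max_backlog = 0
    for arrived in arrivals_per_second:
        backlog += arrived                   # messages enqueued this second
        backlog -= min(backlog, drain_rate)  # consumer writes at a steady rate
        max_backlog = max(max_backlog, backlog)
    return max_backlog, backlog


# Hypothetical load: a short burst, then a calmer period.
arrivals = [1000] * 10 + [100] * 50
max_backlog, remaining = simulate_buffer(arrivals, drain_rate=400)
print(max_backlog, remaining)  # prints "6000 0"
```

In this run the table never sees more than 400 writes per second even though the burst reaches 1,000. The queue holds at most 6,000 messages and is empty again by the end of the minute. The price of the lower provisioned throughput is that some data points reach the table with a delay, which is acceptable here because the data is only aggregated the next morning.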
The other options, such as accessing DynamoDB directly from the mobile app (Option B) or using ElastiCache to cache reads from DynamoDB (Option D), do not lower the cost: the first can increase the load on DynamoDB, and the second adds the cost of an additional service.
Lastly, writing data directly into an Amazon Redshift cluster (Option E) is not a recommended option since it is a different use case than DynamoDB, and the requirements for data aggregation and visualization would require significant changes to the application architecture. Additionally, Redshift is not suitable for real-time data ingestion or low-latency data access.
Therefore, options A and C are the recommended choices to optimize the middleware system's architecture and lower the cost.