Automatically Purging Event Data in DynamoDB

Question

A developer is implementing an IoT application using DynamoDB as the data store for device event data. An application requirement is to purge all event data older than 30 days automatically. What is the optimal option to implement this requirement?

Answers

A. Enable TTL on the DynamoDB table and store the expiration timestamp in the TTL attribute in Unix epoch time format.

B. Implement a Lambda function that queries the table and deletes items with a timestamp older than 30 days, and use CloudWatch Events to trigger the Lambda function on a schedule.

C. Create a new DynamoDB table every 30 days and delete the old DynamoDB table.

D. Enable DynamoDB Streams on the table, and implement a Lambda function that reads events from the stream and deletes expired items.

Answer: A.

Explanations

Option A is CORRECT because Time to Live (TTL) for Amazon DynamoDB automatically deletes items once the expiration time defined by the timestamp in the item's TTL attribute has passed.

Option B is incorrect because it is not the optimal solution: it requires implementing and deploying custom code inside a Lambda function.

Option C is incorrect because it would also delete items newer than 30 days: at the moment the old table is dropped, it still contains events written during the most recent 30-day window.

Option D is incorrect because it does not satisfy the original requirement. A stream records item changes rather than the passage of time, so it would not purge aged data from the table.

Reference:

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/howitworks-ttl.html

The optimal option to implement the requirement of purging all event data older than 30 days automatically in an IoT application using DynamoDB as the data store for device event data is A. Enable TTL on the DynamoDB table and store the expiration timestamp in the TTL attribute in the epoch time format.

TTL (Time to Live) is a feature in DynamoDB that automatically removes expired items from a table. When you enable TTL on a table, you must specify a TTL attribute that contains an expiration timestamp for each item in the table. The TTL attribute must be a number representing the time in Unix epoch time format (the number of seconds since January 1, 1970, 00:00:00 UTC).
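As a minimal sketch of the item format, assuming a hypothetical device_events table keyed by device_id and event_time with a TTL attribute named expires_at (none of these names are given in the question), an application might write events like this with the AWS SDK for Python (boto3):

```python
import time

import boto3

# Hypothetical names used for illustration only.
TABLE_NAME = "device_events"
TTL_ATTRIBUTE = "expires_at"
THIRTY_DAYS_IN_SECONDS = 30 * 24 * 60 * 60

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(TABLE_NAME)


def put_event(device_id: str, payload: str) -> None:
    """Store a device event whose TTL attribute expires 30 days from now."""
    now = int(time.time())                            # current time in epoch seconds
    item = {
        "device_id": device_id,                       # assumed partition key
        "event_time": now,                            # assumed sort key (epoch seconds)
        "payload": payload,
        TTL_ATTRIBUTE: now + THIRTY_DAYS_IN_SECONDS,  # DynamoDB deletes the item after this time
    }
    table.put_item(Item=item)
```

Once TTL is enabled on the table and pointed at this attribute, DynamoDB removes each item in the background after its expiration time passes, with no further application code.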

The benefits of using TTL to automatically delete expired items are:

  1. Low Cost: TTL deletions are performed by DynamoDB in the background at no additional charge and do not consume write capacity, so they are much cheaper than running a Lambda function or creating a new DynamoDB table every 30 days.

  2. Scalability: TTL is a built-in feature of DynamoDB and is designed to handle large volumes of data. It can handle tables with billions of items without impacting performance.

  3. Easy to Implement: Enabling TTL on a table is a simple process that requires only a few clicks in the AWS Management Console or a few lines of code in an SDK or CLI.
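For example, with boto3 the enable step for the same hypothetical table and attribute names used above is a single API call:

```python
import boto3

client = boto3.client("dynamodb")

# Point TTL at the attribute that holds each item's expiration time
# (epoch seconds). Table and attribute names are hypothetical.
client.update_time_to_live(
    TableName="device_events",
    TimeToLiveSpecification={
        "Enabled": True,
        "AttributeName": "expires_at",
    },
)
```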

Option B, implementing a Lambda function that queries the table and deletes items with a timestamp older than 30 days, triggered by CloudWatch Events, is also a viable solution, but it requires more development effort and costs more than using TTL. The Lambda function would have to be scheduled to run at regular intervals to delete expired items, which would consume compute resources and incur charges for the Lambda invocations and the CloudWatch Events rule.
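For comparison, here is a rough sketch of what Option B would involve, using the same hypothetical table and key schema as above; it is shown only to illustrate the extra code and scheduled invocations this approach requires, not as a recommended implementation:

```python
import time

import boto3
from boto3.dynamodb.conditions import Attr

TABLE_NAME = "device_events"                 # hypothetical table name
THIRTY_DAYS_IN_SECONDS = 30 * 24 * 60 * 60

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(TABLE_NAME)


def lambda_handler(event, context):
    """Invoked on a schedule (e.g. a CloudWatch Events/EventBridge rule) to
    scan for and delete items older than 30 days."""
    cutoff = int(time.time()) - THIRTY_DAYS_IN_SECONDS
    scan_kwargs = {"FilterExpression": Attr("event_time").lt(cutoff)}

    while True:
        page = table.scan(**scan_kwargs)
        with table.batch_writer() as batch:
            for item in page["Items"]:
                # Delete by full primary key (assumed: device_id + event_time).
                batch.delete_item(
                    Key={"device_id": item["device_id"],
                         "event_time": item["event_time"]}
                )
        if "LastEvaluatedKey" not in page:
            break
        scan_kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
```

Note that each scheduled run pays for a scan of the table and for every delete, costs that TTL avoids entirely.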

Option C, creating a new DynamoDB table every 30 days and deleting the old table, is not a practical solution, as it would add complexity and data-management overhead. Creating a new table every 30 days would require updating the application code to point to the new table and migrating any data that must be retained from the old table to the new one.

Option D, enabling DynamoDB Streams on the table and implementing a Lambda function to read events from the stream and delete expired items, is not a recommended solution for this use case. Although DynamoDB Streams capture changes to items in near real time, they record modifications, not the passage of time: an event item that is written once and never updated appears in the stream only at write time, so nothing would trigger its deletion 30 days later. Stream records are also retained for only 24 hours, so any change older than 24 hours is no longer available in the stream. Additionally, stream records may be delivered to the Lambda function more than once, so the function would have to tolerate duplicate processing.