Your team is developing a feature intended to increase production usage.
A DynamoDB table has been set up to track relevant activity, such as how the feature boosts usage frequency.
The table receives about 2,000 new items every day, and the volume keeps growing.
Your PM wants to review the daily data in this DynamoDB table.
What is the proper way to present the data to the PM?
Option D is correct:
The AWS Data Pipeline lets you automate the movement and processing of any amount of data using data-driven workflows and built-in dependency checking.
You can now choose between the following options for each pipeline that you build:
Run once.
Run a defined number of times.
Run on activation.
Run indefinitely.
Run repeatedly within a date range.
When you create a Data Pipeline, you can attach a schedule, for example one that starts every day at a given start time.
To end the schedule, you can specify either a number of occurrences or an end date.
For example, the schedule can stop after 10 runs or after two weeks; see the sketch below.
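As a rough illustration (not the exam's reference solution), the sketch below uses boto3 to create a pipeline and attach a daily schedule. The pipeline name, region, start time, and occurrence count are placeholder assumptions, and the data nodes and export activity that a real DynamoDB-to-S3 pipeline also needs are omitted for brevity.

```python
import boto3

# Sketch only: names, times, and the region are placeholder assumptions.
dp = boto3.client("datapipeline", region_name="us-east-1")

pipeline = dp.create_pipeline(
    name="dynamodb-daily-export",          # hypothetical pipeline name
    uniqueId="dynamodb-daily-export-v1",   # idempotency token
)

# A Schedule object that runs daily from the given start time and stops
# after 10 occurrences; an endDateTime field could be used instead.
schedule = {
    "id": "DailySchedule",
    "name": "DailySchedule",
    "fields": [
        {"key": "type", "stringValue": "Schedule"},
        {"key": "period", "stringValue": "1 day"},
        {"key": "startDateTime", "stringValue": "2024-01-01T08:00:00"},
        {"key": "occurrences", "stringValue": "10"},
    ],
}

# A complete definition would also include a Default object, a
# DynamoDBDataNode, an S3DataNode, and an export activity (omitted here).
dp.put_pipeline_definition(
    pipelineId=pipeline["pipelineId"],
    pipelineObjects=[schedule],
)
dp.activate_pipeline(pipelineId=pipeline["pipelineId"])
```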
Option A is incorrect because the data is needed daily, so the pipeline should include a schedule; otherwise it would have to be run manually every day, which is unnecessary.
Option B is incorrect because the DynamoDB console only lets you select up to 100 items at a time to export to a CSV file.
Given that the table gains about 2,000 items per day and keeps growing, this approach is not reasonable.
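For contrast, here is a rough sketch of what a manual export would involve: a paginated Scan of the whole table written to CSV. The table name is a placeholder assumption, and the code assumes items share a mostly uniform attribute set.

```python
import csv
import boto3

TABLE_NAME = "feature-usage-tracking"  # hypothetical table name

table = boto3.resource("dynamodb").Table(TABLE_NAME)

with open("daily_export.csv", "w", newline="") as f:
    writer = None
    scan_kwargs = {}
    while True:
        page = table.scan(**scan_kwargs)
        for item in page["Items"]:
            if writer is None:
                # Take the header from the first item; attributes missing
                # from later items are left blank, extras are ignored.
                writer = csv.DictWriter(
                    f, fieldnames=sorted(item), extrasaction="ignore"
                )
                writer.writeheader()
            writer.writerow(item)
        # Scan returns at most 1 MB per call; follow LastEvaluatedKey
        # until the table is exhausted.
        if "LastEvaluatedKey" not in page:
            break
        scan_kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
```

Running and babysitting a script like this every day is exactly the manual effort a scheduled Data Pipeline avoids.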
Option C is incorrect because the end time is not used properly.
For a Data Pipeline schedule, the end date/time is when the whole schedule stops, such as one month later.
In this case, an end time of "8:00 PM" is invalid and not required.
Instead, you could add an end date on which the whole schedule stops, as required.
For details on how to use the end time, see https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-schedule.html.
In summary, the most appropriate way to present the data to the PM is to create a data pipeline that transfers the data from the DynamoDB table to an S3 bucket and schedule it to run every day at a specific time. Options A, C, and D all transfer data to S3, but Option B is not a suitable approach for data at this scale.
Option A is a valid way to transfer data to an S3 bucket, but it requires the PM to open the files in S3 manually with an editor. That may not be practical if the PM wants to view the data frequently, and because no time range is specified it is hard to distinguish one day's data from another's.
Option C specifies a start and an end time for the data pipeline, which can limit the amount of data transferred to S3, but it is not ideal for presenting the data, as the PM may want to access it at any time.
Option D is similar to Option A in that it transfers the data to S3, but it also includes a schedule for activating the pipeline. This is optimal: the transfer runs at a specific time each day, so the PM has access to the data every day without performing any manual tasks, and the data is stored in an S3 bucket, a scalable, reliable, and secure storage service provided by AWS.
Therefore, Option D is the most appropriate way to present the data to the PM.