Create Real-Time Aggregated Time Series in DynamoDB | AWS Certified Big Data - Specialty Exam

Automated Storage of Real-Time Aggregated Time Series in DynamoDB

Question

A development team has been requested to create an application that will be used to ingest data.

The data consists of various metrics from various devices.

The requirement is to create an automated way to store real-time aggregated time series in DynamoDB.

How could you accomplish this?

Answers

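A. Use the Amazon Kinesis Aggregators framework.

B. Push the metrics to CloudWatch for aggregation.

C. Aggregate the metrics from the producer side.

D. Aggregate the metrics in the stream itself.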

Answer - A.

Amazon Kinesis Aggregators is a Java framework that enables the automatic creation of real-time aggregated time series data from Amazon Kinesis streams.

You can use this data to answer questions such as "how many times per second has 'x' occurred?" or "what was the breakdown by hour over the day of the streamed data containing 'y'?"

Using this framework, you simply describe the format of the data on your stream (CSV, JSON, and so on), the granularity of the time series that you require (seconds, minutes, hours, and so on), and how the streamed data elements should be grouped; the framework handles all the time series calculations and data persistence.
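To make the idea of granularity and grouping concrete, the following is a minimal sketch of time-bucketed aggregation in plain Python. It is not the Kinesis Aggregators API (which is a Java framework); the record fields `device` and `timestamp`, the granularity names, and the function names are all assumptions chosen for illustration.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical granularities, expressed as bucket widths in seconds.
GRANULARITY_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def bucket_start(epoch_seconds, granularity):
    """Truncate a timestamp to the start of its time bucket (ISO-8601, UTC)."""
    width = GRANULARITY_SECONDS[granularity]
    floored = int(epoch_seconds) // width * width
    return datetime.fromtimestamp(floored, tz=timezone.utc).strftime(
        "%Y-%m-%dT%H:%M:%SZ"
    )


def aggregate(records, label_field, granularity):
    """Count records per (label value, time bucket) pair."""
    counts = Counter()
    for record in records:
        key = (record[label_field], bucket_start(record["timestamp"], granularity))
        counts[key] += 1
    return counts


if __name__ == "__main__":
    # Assumed record shape: one dict per device reading.
    sample = [
        {"device": "sensor-1", "timestamp": 0},
        {"device": "sensor-1", "timestamp": 59},
        {"device": "sensor-1", "timestamp": 60},
        {"device": "sensor-2", "timestamp": 61},
    ]
    for (device, bucket), count in sorted(aggregate(sample, "device", "minute").items()):
        print(device, bucket, count)
```

Each (label, bucket) pair corresponds to one point in the aggregated time series, which is the kind of record the framework is described as persisting on your behalf.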

You then simply consume the time series aggregates in your application using Amazon DynamoDB, or interact with the time series using Amazon CloudWatch or the Web Query API.
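As a rough illustration of consuming aggregates from DynamoDB, the sketch below builds the parameters for a DynamoDB Query over a range of time buckets. The table layout is an assumption, not the schema the framework actually creates: it supposes a partition key named `label_value` and a sort key named `bucket_start` holding ISO-8601 strings. The function name is also hypothetical.

```python
def build_range_query(table_name, label_value, start_bucket, end_bucket):
    """Build Query parameters for one label over an inclusive range of buckets.

    Assumes a hypothetical table keyed by label_value (partition key) and
    bucket_start (sort key). ISO-8601 UTC strings sort in time order, so
    BETWEEN on the sort key selects a contiguous time range.
    """
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#label = :label AND #bucket BETWEEN :start AND :end",
        "ExpressionAttributeNames": {
            "#label": "label_value",
            "#bucket": "bucket_start",
        },
        "ExpressionAttributeValues": {
            ":label": {"S": label_value},
            ":start": {"S": start_bucket},
            ":end": {"S": end_bucket},
        },
    }
```

With boto3 installed and credentials configured, these parameters could be passed to a DynamoDB client as `client.query(**params)`; the dictionary itself can be built and inspected without any AWS access.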

Option B is incorrect since you need to use Kinesis here for ingestion of data.

Option C is incorrect since the data should be ingested as is and aggregated on the consumer side.

Option D is incorrect since the stream is just used to ingest the information.

For more information on the Kinesis Aggregators library, see the following URL:

https://github.com/awslabs/amazon-kinesis-aggregators

For storing real-time aggregated time series in DynamoDB, the four options compare as follows:

A. Use the Amazon Kinesis Aggregators framework: Amazon Kinesis Aggregators is a Java framework that reads records from an Amazon Kinesis stream, groups them by the fields and time granularity you configure, and persists the resulting time series aggregates to Amazon DynamoDB. It is a framework that you run yourself rather than a fully managed service, but it handles the time series calculations and data persistence for you.

B. Push the metrics to CloudWatch for aggregation: CloudWatch is a monitoring service that can collect and track metrics, collect and monitor log files, and set alarms. It aggregates metrics for monitoring purposes, but it is not designed for real-time data ingestion and does not by itself store aggregated time series in DynamoDB, so it does not meet the requirement.

C. Aggregate the metrics from the producer side: In this solution, the data is pre-aggregated on the device or application that generates it before it is ingested. This can reduce the amount of data that needs to be transferred and processed, but it requires additional processing logic on every producer, whereas the requirement here is to ingest the data as is and aggregate it on the consumer side.

D. Aggregate the metrics in the stream itself: A stream is only used to ingest the data; it does not aggregate records on its own. The aggregation has to be performed by a consumer application that reads from the stream, which is the role the Kinesis Aggregators framework fills in option A.

In conclusion, the best option for storing real-time aggregated time series in DynamoDB is the Amazon Kinesis Aggregators framework, as it is specifically designed for this purpose and handles the time series calculations and data persistence automatically.
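For readers curious what persisting an aggregate to DynamoDB can look like at the API level, here is a minimal sketch that builds the parameters for an UpdateItem call using the `ADD` action, which increments a numeric attribute atomically and creates it if it does not yet exist. This is not how the framework is documented to work internally; the table keys (`label_value`, `bucket_start`), the counter attribute `event_count`, and the function name are assumptions for illustration.

```python
def build_counter_update(table_name, label_value, bucket, increment=1):
    """Build UpdateItem parameters that add `increment` to a per-bucket counter.

    Assumes a hypothetical table keyed by label_value (partition key) and
    bucket_start (sort key), with a numeric attribute event_count.
    """
    return {
        "TableName": table_name,
        "Key": {
            "label_value": {"S": label_value},
            "bucket_start": {"S": bucket},
        },
        # ADD on a number attribute increments it, or initializes it if absent.
        "UpdateExpression": "ADD #count :inc",
        "ExpressionAttributeNames": {"#count": "event_count"},
        # DynamoDB's low-level format encodes numbers as strings.
        "ExpressionAttributeValues": {":inc": {"N": str(increment)}},
    }
```

With boto3 available, the result could be sent as `client.update_item(**params)`. Because the increment is applied on the server, several consumers could update the same bucket without reading the current value first.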