Kinesis Data Analytics SQL - Redesigning Time-Based Windowed Queries

Understanding TIMESTAMP for In-Application Stream Processing

Question

A Kinesis Data Analytics SQL application is configured to read a Kinesis data stream of events produced by IoT devices. The support team wants to use the timestamp of when the data was added to the in-application stream and redesign their time-based windowed queries around it. Which of the options below provide relevant information? Select 2 options.

Answers

Explanations


A. ROWTIME
B. INGEST TIME
C. EVENT TIME
D. PROCESSING TIME

Answer: A, D.

Option A is correct - ROWTIME stores the timestamp at which Amazon Kinesis Data Analytics inserts a row into the first in-application stream.

ROWTIME reflects the timestamp at which Amazon Kinesis Data Analytics inserted a record into the first in-application stream after reading from the streaming source.

This ROWTIME value is then maintained throughout your application.

Amazon Kinesis Data Analytics guarantees that the ROWTIME values are monotonically increasing.

You use this timestamp in time-based windowed queries.

https://docs.aws.amazon.com/kinesisanalytics/latest/dev/timestamps-rowtime-concepts.html
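To make the windowing idea concrete, here is a minimal Python sketch that simulates a one-minute tumbling window keyed on ROWTIME. It is a simulation only: it is not Kinesis Data Analytics SQL and it calls no AWS API. The sample rows, the device IDs, and the helper names `window_start` and `count_per_window` are assumptions made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def window_start(rowtime: datetime, size: timedelta) -> datetime:
    """Floor a ROWTIME value to the start of its tumbling window."""
    return EPOCH + ((rowtime - EPOCH) // size) * size


def count_per_window(rows, size=timedelta(minutes=1)):
    """Count rows per (window start, device) pair, keyed on ROWTIME."""
    counts = Counter()
    for rowtime, device_id in rows:
        counts[(window_start(rowtime, size), device_id)] += 1
    return dict(counts)


# Hypothetical rows: (ROWTIME assigned on insert, device id).
# ROWTIME never decreases, so the rows are already in window order.
rows = [
    (datetime(2024, 1, 1, 12, 0, 5), "sensor-1"),
    (datetime(2024, 1, 1, 12, 0, 40), "sensor-1"),
    (datetime(2024, 1, 1, 12, 1, 10), "sensor-1"),
]

result = count_per_window(rows)
for (start, device), n in sorted(result.items()):
    print(start.isoformat(), device, n)
```

Because ROWTIME is monotonically increasing, a window can be closed as soon as a row with a later window start arrives; no row for an earlier window can show up afterwards.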

Option B is incorrect - INGEST TIME is the timestamp of when a record was added to the streaming source.

Amazon Kinesis Data Streams includes a field called APPROXIMATE_ARRIVAL_TIME in every record that provides this timestamp.

This is also sometimes referred to as the server-side time.

This ingest time is often a close approximation of event time.

The ingest time is rarely out of order, but it can happen because of the distributed nature of streaming data.

Therefore, ingest time is a mostly accurate and in-order reflection of the event time.

https://docs.aws.amazon.com/kinesisanalytics/latest/dev/timestamps-rowtime-concepts.html

Option C is incorrect - EVENT TIME is the timestamp when the event occurred.

This is also sometimes called the client-side time.

It is often desirable to use this time in analytics because it is the time when an event occurred.

However, many event sources, such as mobile phones and web clients, do not have reliable clocks, which can lead to inaccurate times.

In addition, connectivity issues can lead to records appearing on a stream not in the same order the events occurred.

https://docs.aws.amazon.com/kinesisanalytics/latest/dev/timestamps-rowtime-concepts.html
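The ordering difference between event time and ingest time can be illustrated with a small Python sketch. The records below are invented sample data, and `out_of_order_count` is a hypothetical helper, not part of any AWS SDK. It counts how many timestamps arrive after a later timestamp has already been seen.

```python
from datetime import datetime

# Hypothetical (event_time, ingest_time) pairs, listed in arrival order.
records = [
    (datetime(2024, 1, 1, 12, 0, 1), datetime(2024, 1, 1, 12, 0, 2)),
    (datetime(2024, 1, 1, 12, 0, 9), datetime(2024, 1, 1, 12, 0, 10)),
    # This event happened early but was delayed by a connectivity gap.
    (datetime(2024, 1, 1, 12, 0, 4), datetime(2024, 1, 1, 12, 0, 11)),
]


def out_of_order_count(timestamps):
    """Count values smaller than the largest timestamp seen before them."""
    late = 0
    high_water = None
    for ts in timestamps:
        if high_water is not None and ts < high_water:
            late += 1
        else:
            high_water = ts
    return late


event_late = out_of_order_count(e for e, _ in records)
ingest_late = out_of_order_count(i for _, i in records)
print("out of order by event time:", event_late)
print("out of order by ingest time:", ingest_late)
```

In this sample the delayed record is out of order by event time but in order by ingest time, which is the trade-off the explanations for options B and C describe.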

Option D is correct - PROCESSING TIME is the timestamp when Amazon Kinesis Data Analytics inserts a row in the first in-application stream.

Amazon Kinesis Data Analytics provides this timestamp in the ROWTIME column that exists in each in-application stream.

The processing time is always monotonically increasing.

But it will not be accurate if your application falls behind.

(If an application falls behind, the processing time does not accurately reflect the event time.) This ROWTIME is accurate in relation to the wall clock, but it might not be the time when the event actually occurred.

https://docs.aws.amazon.com/kinesisanalytics/latest/dev/timestamps-rowtime-concepts.html
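The effect of an application falling behind can be sketched in a few lines of Python. This is an illustration under assumed numbers (a 2-second and a 90-second processing delay), and `minute_bucket` is a hypothetical helper. It shows that a late ROWTIME can place a record in a different one-minute window than its event time would.

```python
from datetime import datetime, timedelta


def minute_bucket(ts: datetime) -> datetime:
    """Start of the one-minute window that contains ts."""
    return ts.replace(second=0, microsecond=0)


# Assumed event time reported by the device.
event_time = datetime(2024, 1, 1, 12, 0, 50)

# ROWTIME when the application keeps up (assumed 2 s delay) ...
on_time_rowtime = event_time + timedelta(seconds=2)
# ... and when it has fallen behind (assumed 90 s delay).
delayed_rowtime = event_time + timedelta(seconds=90)

same_window_on_time = minute_bucket(on_time_rowtime) == minute_bucket(event_time)
same_window_delayed = minute_bucket(delayed_rowtime) == minute_bucket(event_time)

print("keeping up, same window as event time:", same_window_on_time)
print("fallen behind, same window as event time:", same_window_delayed)
```

The delayed ROWTIME is still a correct wall-clock reading of when the row was inserted; it simply no longer says when the event happened.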

Kinesis Data Analytics SQL is a managed AWS service that lets developers process and analyze real-time streaming data using standard SQL queries. In this scenario, the application processes events generated by IoT devices, and the support team wants to base its redesigned time-based windowed queries on the timestamp of when the data was added to the in-application stream.

There are three kinds of time to consider when processing streaming data with Kinesis Data Analytics SQL, plus the ROWTIME column that carries one of them:

  1. ROWTIME: This is the column, present in every in-application stream, that holds the time when Kinesis Data Analytics inserted the row into the first in-application stream. It is set by Kinesis Data Analytics, not by the producer, and it is not necessarily the time when the event occurred.

  2. INGEST TIME: This is the time when a record was added to the streaming source (the Kinesis stream). It is set by Kinesis Data Streams, which exposes it as APPROXIMATE_ARRIVAL_TIME, and it cannot be modified by the user.

  3. EVENT TIME: This is the time when an event occurred or was generated by a device or system. The user can supply it by including a timestamp field in the event data.

  4. PROCESSING TIME: This is the time when Kinesis Data Analytics inserts a row into the first in-application stream. It is set by Kinesis Data Analytics and is exposed through the ROWTIME column.

In the given scenario, the support team wants the timestamp of when the data was added to the in-application stream. That is the processing time, which Kinesis Data Analytics exposes in the ROWTIME column. Therefore, option A, ROWTIME, and option D, PROCESSING TIME, both provide relevant information.

Option B, INGEST TIME, is not relevant because it represents the time when the record was added to the streaming source, not to the in-application stream.

Option C, EVENT TIME, is not relevant because it is the client-side time when the event occurred, which the producer must supply in the event data; it is not the time when the data was added to the in-application stream.