HikeHills.com - Optimizing AWS Kinesis Stream for Improved Performance

Question

HikeHills.com (HH) is an online specialty retailer that sells clothing and outdoor recreation gear for trekking, camping, road biking, mountain biking, rock climbing, ice mountaineering, skiing, avalanche protection, snowboarding, fly fishing, kayaking, rafting, road and trail running, and more. HH runs its entire online infrastructure on Java-based web applications running on AWS.

HH captures clickstream data and uses a custom-built recommendation engine to recommend products, which helps improve sales and understand customer preferences. It already uses the AWS Kinesis Producer Library (KPL) to collect events and transaction logs and process the stream. The HH IT team identified a number of performance issues with the Kinesis stream and, based on the metrics captured, identified hot and cold shards.

The IT team wants to make better use of the unused capacity of the shards.

How can they achieve that? Select 2 options.

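Answers

A. Merge the hot shards to make better use of their unused capacity
B. Merge the shards that receive more data to make better use of their unused capacity
C. Merge the cold shards to make better use of their unused capacity
D. Merge the shards that receive less data to make better use of their unused capacity

Explanations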

Answer: C,D.

Option A is incorrect. Hot shards (shards that receive more data than expected) are generally split, not merged, to improve the performance of the stream.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding-strategies.html

Option B is incorrect. Shards that receive more data are hot shards, and hot shards are generally split, not merged, to improve the performance of the stream.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding-strategies.html

Option C is correct. Merging cold shards makes better use of their unused capacity.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding-strategies.html

Option D is correct. Shards that receive less data are cold shards, and merging them makes better use of their unused capacity.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-resharding-strategies.html

Amazon Kinesis is a managed, scalable, and serverless streaming data service that enables organizations to collect, process, and analyze real-time, streaming data such as clickstream data, log data, and social media feeds. It can ingest large amounts of data per second and stream it in near real-time to various AWS services like S3, Redshift, Elasticsearch, and Lambda for further processing.

In this scenario, HikeHills.com (HH) is using Amazon Kinesis to collect and process clickstream data and transaction logs to feed their custom recommendation engine. However, their IT team has identified some performance issues with the Kinesis Stream and has found that there are hot and cold shards.

A shard is the base unit of throughput in a Kinesis data stream. A stream is made up of one or more shards, and each shard holds a sequence of data records and provides a fixed amount of capacity (up to 1 MB or 1,000 records per second for writes, and up to 2 MB per second for reads). HH's IT team can view shard-level metrics to determine which shards are hot (receiving more data than expected) and which are cold (receiving less data than expected). Resharding lets the team adapt the stream to this traffic: splitting a hot shard adds capacity where it is needed, while merging cold shards consolidates underused capacity and reduces the number of shards in the stream.
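As an illustration of how hot and cold shards might be identified from metrics, here is a minimal sketch. It assumes per-shard write rates have already been collected (for example from shard-level CloudWatch metrics such as IncomingBytes and IncomingRecords). The function names and the 80% / 20% thresholds are hypothetical choices for this example, not values from AWS or from HH's setup, and the 1 MB/s write limit is taken here as 1,000,000 bytes.

```python
# Per-shard write limits for a Kinesis data stream.
# 1 MB/s is approximated as 1,000,000 bytes for this sketch.
MAX_BYTES_PER_SEC = 1_000_000
MAX_RECORDS_PER_SEC = 1_000


def utilization(bytes_per_sec, records_per_sec):
    """Fraction of the shard's write capacity in use (the tighter of the two limits)."""
    return max(bytes_per_sec / MAX_BYTES_PER_SEC,
               records_per_sec / MAX_RECORDS_PER_SEC)


def classify_shards(metrics, hot_threshold=0.8, cold_threshold=0.2):
    """Label each shard 'hot', 'cold', or 'normal'.

    metrics maps shard id -> (bytes_per_sec, records_per_sec).
    The thresholds are illustrative assumptions, not AWS-defined values.
    """
    labels = {}
    for shard_id, (bytes_per_sec, records_per_sec) in metrics.items():
        used = utilization(bytes_per_sec, records_per_sec)
        if used >= hot_threshold:
            labels[shard_id] = "hot"
        elif used <= cold_threshold:
            labels[shard_id] = "cold"
        else:
            labels[shard_id] = "normal"
    return labels
```

In practice the thresholds would be tuned to the traffic the stream is expected to carry, since "hot" and "cold" are relative to that expectation.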

There are two resharding operations, and only one of them makes better use of unused capacity:

  1. Split the hot shards: When a shard is hot, it is receiving more data than expected and may be approaching or exceeding its capacity. Splitting a hot shard divides its hash key range between two new shards, which increases the capacity available to those keys and helps to prevent throttling. Merging hot shards would do the opposite and concentrate even more traffic on a single shard.

  2. Merge the cold shards: When a shard is cold, it is receiving less data than expected, so part of its capacity sits unused. Merging two adjacent cold shards combines their hash key ranges into a single shard, so the same traffic is served with less idle capacity. This can also reduce costs, because the number of provisioned shards is a factor in the pricing of Amazon Kinesis.
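Merging cold shards can be sketched as follows. Kinesis only merges shards whose hash key ranges are adjacent, so the sketch first pairs up cold shards with contiguous ranges. It assumes `shards` is a list of open shards in the shape returned by the boto3 Kinesis `list_shards` call, and that `client` is a boto3 Kinesis client (or any object exposing the same `merge_shards` method). The helper names are hypothetical.

```python
def adjacent_cold_pairs(shards, cold_ids):
    """Return (shard_id, adjacent_shard_id) pairs of cold shards with contiguous hash key ranges.

    shards: open shards as dicts with 'ShardId' and
            'HashKeyRange': {'StartingHashKey': str, 'EndingHashKey': str}.
    cold_ids: set of shard ids already judged to be cold.
    """
    ordered = sorted(shards, key=lambda s: int(s["HashKeyRange"]["StartingHashKey"]))
    pairs = []
    i = 0
    while i < len(ordered) - 1:
        left, right = ordered[i], ordered[i + 1]
        contiguous = (int(left["HashKeyRange"]["EndingHashKey"]) + 1
                      == int(right["HashKeyRange"]["StartingHashKey"]))
        if contiguous and left["ShardId"] in cold_ids and right["ShardId"] in cold_ids:
            pairs.append((left["ShardId"], right["ShardId"]))
            i += 2  # a shard can take part in only one merge
        else:
            i += 1
    return pairs


def merge_pair(client, stream_name, pair):
    """Merge one pair of adjacent shards.

    The stream enters the UPDATING state after a reshard, so the caller
    should wait for it to become ACTIVE again before merging the next pair.
    """
    shard_id, adjacent_id = pair
    client.merge_shards(
        StreamName=stream_name,
        ShardToMerge=shard_id,
        AdjacentShardToMerge=adjacent_id,
    )
```

With boto3, `client` would typically be created with `boto3.client("kinesis")`. Only one pair is merged per call here because resharding operations on a stream cannot overlap.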

Based on the options provided, the correct answers are C (Merge the cold shards to make better use of their unused capacity) and D (Merge the shards that receive less data to make better use of their unused capacity). Both describe the same operation, because the shards that receive less data are the cold shards.

Option A (Merge the hot shards to make better use of their unused capacity) is incorrect because hot shards have little or no unused capacity, and merging them would concentrate even more traffic on a single shard and worsen throttling.

Option B (Merge the shards that receive more data to make better use of their unused capacity) is incorrect for the same reason: the shards that receive more data are the hot shards, and they should be split rather than merged.
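For comparison, hot shards are usually split rather than merged, as the resharding documentation linked above describes. A minimal sketch of splitting a hot shard at the midpoint of its hash key range follows. It assumes `shard` has the shape returned by the boto3 Kinesis `list_shards` call and that `client` exposes the boto3 `split_shard` method. The helper names are hypothetical, and splitting at the midpoint is only one possible choice of split point.

```python
def midpoint_hash_key(shard):
    """Return the midpoint of the shard's hash key range as a decimal string."""
    start = int(shard["HashKeyRange"]["StartingHashKey"])
    end = int(shard["HashKeyRange"]["EndingHashKey"])
    mid = (start + end) // 2
    if mid <= start:
        # The new starting hash key must lie strictly inside the range.
        raise ValueError("hash key range is too narrow to split")
    return str(mid)


def split_hot_shard(client, stream_name, shard):
    """Split one hot shard into two child shards at the midpoint of its range."""
    client.split_shard(
        StreamName=stream_name,
        ShardToSplit=shard["ShardId"],
        NewStartingHashKey=midpoint_hash_key(shard),
    )
```

A midpoint split divides the hash key range evenly, which balances load only if the partition keys are spread evenly across that range.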