Resharding Behavior of Kinesis Stream:

Question

HikeHills.com (HH) is an online specialty retailer that sells clothing and outdoor recreation gear for trekking, camping, road biking, mountain biking, rock climbing, ice climbing, skiing, avalanche protection, snowboarding, fly fishing, kayaking, rafting, road and trail running, and more. HH runs its entire online infrastructure on Java-based web applications running on AWS.

HH captures clickstream data and uses a custom-built recommendation engine to recommend products, which improves sales and helps HH understand customer preferences. HH already uses the AWS Kinesis Producer Library to collect events and transaction logs and process the stream. The HH IT team identified many performance issues with the Kinesis stream and, based on the metrics captured, identified hot and cold shards.

The IT team wants to perform resharding.

There are two shards: SHARD 1 with a hash key range of 276...381 and SHARD 2 with a hash key range of 382...454.

After resharding, how does the Kinesis stream behave? Select 5 options.

Answers

Explanations

A. B. C. D. E. F. G.

Answer: A, C, D, E, H.

Option A is correct - Data records that were flowing to the parent shards are re-routed to flow to the child shards based on the hash key values that the data-record partition keys map to.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-after-resharding.html

Option B is incorrect - The parent shards do not disappear when the reshard occurs.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-after-resharding.html

Option C is correct - Data records that were in the parent shards before the reshard remain in those shards.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-after-resharding.html

Option D is correct - Before a reshard operation, a parent shard is in the OPEN state.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-after-resharding.html

Option E is correct - After a reshard operation, the parent shard transitions to a CLOSED state.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-after-resharding.html

Option F is incorrect - After a reshard operation, the parent shard will be in the CLOSED state.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-after-resharding.html

Option G is incorrect - After a reshard operation, the child shard will be in the OPEN state.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-after-resharding.html

Option H is correct - After the stream's retention period has expired, the parent shard moves to the EXPIRED state.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-after-resharding.html
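The routing described for Option A can be illustrated with a small sketch. It uses the toy hash key ranges from the question, and it assumes, purely for illustration, that SHARD 1 is split at hash key 328. The shard names for the children and the helper names `hash_key_for` and `route` are hypothetical. In a real stream, hash keys are 128-bit integers derived from the MD5 hash of the partition key, which `hash_key_for` mimics.

```python
import hashlib


def hash_key_for(partition_key):
    """Map a partition key to a 128-bit integer hash key using MD5."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16)


def route(hash_key, open_shards):
    """Return the name of the open shard whose inclusive range holds hash_key."""
    for name, (start, end) in open_shards.items():
        if start <= hash_key <= end:
            return name
    raise LookupError(f"no open shard covers hash key {hash_key}")


# Open shards before the reshard (toy ranges from the question).
before = {"SHARD 1": (276, 381), "SHARD 2": (382, 454)}

# Open shards after a hypothetical split of SHARD 1 at hash key 328.
# SHARD 1 itself is now closed for writes, so it is not a routing target.
after = {
    "SHARD 1-a": (276, 327),
    "SHARD 1-b": (328, 381),
    "SHARD 2": (382, 454),
}

# A record whose partition key maps to hash key 300 moves from the
# parent to the first child; SHARD 2 traffic is unaffected.
routed_before = route(300, before)
routed_after = route(300, after)
```

Only new writes are re-routed. Records already stored in SHARD 1 stay there, which is why Option C is also correct.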

Kinesis Streams is a distributed, scalable, and fault-tolerant data streaming service that allows you to capture and process large amounts of data in real time. A stream is divided into smaller units called shards, each of which covers a range of hash keys and can handle a fixed amount of data. Resharding is the process of changing the number of shards in a stream, either by splitting one shard into two or by merging two adjacent shards into one.
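As a rough illustration of what a split request looks like, the sketch below builds the arguments for a SplitShard call that divides a shard at the middle of its hash key range. The helper name `midpoint_split_request` and the stream name are hypothetical. The shard dictionary is assumed to have the shape returned by the ListShards API, where hash keys are decimal strings, and the toy range from the question stands in for a real 128-bit range.

```python
def midpoint_split_request(stream_name, shard):
    """Build SplitShard arguments that split `shard` near the middle of its range.

    `shard` is assumed to look like one entry of a ListShards response:
    {"ShardId": "...",
     "HashKeyRange": {"StartingHashKey": "...", "EndingHashKey": "..."}}
    """
    start = int(shard["HashKeyRange"]["StartingHashKey"])
    end = int(shard["HashKeyRange"]["EndingHashKey"])

    # The new starting hash key is the first key of the second child;
    # the first child keeps start .. new_start - 1.
    new_start = (start + end) // 2
    if new_start <= start:
        raise ValueError("hash key range is too small to split")

    return {
        "StreamName": stream_name,
        "ShardToSplit": shard["ShardId"],
        "NewStartingHashKey": str(new_start),
    }


# Toy shard using the range given for SHARD 1 in the question.
shard_1 = {
    "ShardId": "SHARD 1",
    "HashKeyRange": {"StartingHashKey": "276", "EndingHashKey": "381"},
}
request = midpoint_split_request("hh-clickstream", shard_1)

# With a boto3 Kinesis client (assumed, not created here), the request
# could be submitted as: client.split_shard(**request)
```

Splitting at the midpoint is only one choice. For a hot shard whose traffic is concentrated in part of its range, a different split point may balance the load better.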

When a resharding operation is performed on a Kinesis Stream, the following happens:

  1. The Kinesis Stream is temporarily put into the UPDATING state while the split or merge is carried out. Read and write operations continue to work during this time, and the stream returns to the ACTIVE state once the operation completes.

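  2. The new shards are created based on the configuration specified during the resharding operation. Resharding is always pairwise: a single operation splits one shard into two or merges two shards into one. In this case, let's assume that the two parent shards, SHARD 1 and SHARD 2, are each split into two child shards, giving four child shards in total.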

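  3. The parent shards, SHARD 1 and SHARD 2, do not have an UPDATING state of their own. UPDATING is a status of the stream; an individual shard is in the OPEN, CLOSED, or EXPIRED state.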

  4. Data records that were flowing to the parent shards are re-routed to flow to the child shards based on the hash key values that the data-record partition keys map to. This means that each child shard will receive a subset of the data records that were previously being sent to the parent shards.

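  5. The parent shards, SHARD 1 and SHARD 2, are in the OPEN state before the resharding operation, which means that data records can be both added to them and retrieved from them. After the operation, they transition to the CLOSED state and no longer accept new records.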

  6. The child shards are created in the OPEN state. Data records can be written to and read from them as soon as the resharding operation completes.

  7. Once the child shards have been created, each child shard of a split has a hash key range that is a subset of its parent's hash key range, and the two children together cover exactly the parent's range. For a merge, the single child shard covers the combined range of the two adjacent parent shards.

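  8. The old parent shards, SHARD 1 and SHARD 2, do not disappear when the resharding operation is complete. The records that were in the parent shards before the reshard remain in those shards and are not moved to the child shards. They can still be read from the closed parent shards until they expire.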

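  9. After the resharding operation is complete, no manual step is needed to open the child shards. To preserve record order, a consumer should finish reading the remaining records from a parent shard before it starts reading from that parent's child shards.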

  10. Finally, it's worth noting that after a stream's retention period has expired, all the data records in the parent shards have expired and the parent shards move to the EXPIRED state. No data records can be written to or read from expired shards, and they are no longer included when the stream's shards are listed.
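The shard state transitions covered by the correct options (a parent is OPEN before the reshard and CLOSED after it, child shards start OPEN, and the parent becomes EXPIRED once the retention period has passed) can be modelled with a small in-memory sketch. This is a toy model under simplifying assumptions, not the Kinesis API: the `Shard` class and the `split` and `expire` functions are hypothetical names, and the split point of 328 for SHARD 1 is chosen only for illustration.

```python
from enum import Enum


class ShardState(Enum):
    OPEN = "OPEN"        # records can be added and retrieved
    CLOSED = "CLOSED"    # no new records; existing ones can still be read
    EXPIRED = "EXPIRED"  # retention period has passed; nothing left to read


class Shard:
    def __init__(self, name, start, end):
        self.name = name
        self.start = start
        self.end = end
        self.state = ShardState.OPEN
        self.records = []

    def put(self, record):
        if self.state is not ShardState.OPEN:
            raise RuntimeError(f"{self.name} is {self.state.value}; cannot write")
        self.records.append(record)

    def read(self):
        if self.state is ShardState.EXPIRED:
            return []
        return list(self.records)


def split(parent, new_start):
    """Close the parent and return two OPEN children covering its range."""
    if parent.state is not ShardState.OPEN:
        raise RuntimeError("only an OPEN shard can be split")
    if not parent.start < new_start <= parent.end:
        raise ValueError("split point must fall inside the parent's range")
    parent.state = ShardState.CLOSED  # existing records stay in the parent
    return (
        Shard(parent.name + "-a", parent.start, new_start - 1),
        Shard(parent.name + "-b", new_start, parent.end),
    )


def expire(shard):
    """Model the end of the retention period for a closed shard."""
    shard.records.clear()
    shard.state = ShardState.EXPIRED


parent = Shard("SHARD 1", 276, 381)
parent.put("click-event-1")
child_a, child_b = split(parent, 328)
```

After the split in this model, `parent.read()` still returns the record written before the reshard, while `parent.put(...)` raises because the parent is closed. Calling `expire(parent)` then moves it to the final state.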