Troubleshooting Provisioned Throughput Errors with KCL and KPL for AWS Certified Big Data - Specialty Exam

Resolving Provisioned Throughput Errors with KCL and KPL

Question

Your development team has created separate applications that use the KPL and KCL libraries to write data to and read data from Kinesis streams.

The KPL is being used to stream information from thousands of IoT devices.

The KCL application is consuming the records and providing real time analytics to the data science team.

After a dry run, the KCL-based application is getting provisioned throughput errors.

Which of the following should be carried out to resolve this issue?

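Answers

A. Ensure the number of shards is increased.

B. Increase the number of streams.

C. Increase the throughput for DynamoDB tables.

D. Ensure the application has the right IAM Role attached.

Explanations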

Answer - C.

The AWS Documentation mentions the following.

If your Amazon Kinesis Data Streams application receives provisioned-throughput exceptions, you should increase the provisioned throughput for the DynamoDB table.

The KCL creates the table with a provisioned throughput of 10 reads per second and 10 writes per second, but this might not be sufficient for your application.

For example, if your Amazon Kinesis Data Streams application does frequent checkpointing or operates on a stream that is composed of many shards, you might need more throughput.
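To see why checkpoint frequency and shard count matter, a back-of-envelope estimate can help. The sketch below is not an AWS formula: it assumes one small table write per shard per checkpoint and one per lease renewal, with each write costing one write capacity unit. The function name and all input values are hypothetical.

```python
import math

# Write capacity the KCL provisions by default, per the documentation quoted above.
DEFAULT_WRITE_UNITS = 10


def estimate_table_writes_per_second(shards, checkpoint_interval_s, lease_renewal_interval_s):
    """Rough estimate of writes per second against the KCL's DynamoDB table.

    Assumption (not from the AWS documentation): every shard produces one
    small write per checkpoint and one per lease renewal, and each write
    consumes a single write capacity unit.
    """
    if shards < 0 or checkpoint_interval_s <= 0 or lease_renewal_interval_s <= 0:
        raise ValueError("shards must be >= 0 and intervals must be > 0")
    checkpoint_writes = shards / checkpoint_interval_s
    renewal_writes = shards / lease_renewal_interval_s
    return math.ceil(checkpoint_writes + renewal_writes)


# Hypothetical workload: 50 shards, checkpointing and renewing leases every 5 seconds.
needed = estimate_table_writes_per_second(50, 5, 5)
print(f"estimated writes/s: {needed}, exceeds default: {needed > DEFAULT_WRITE_UNITS}")
```

Under these assumptions, checkpointing less often or raising the table's write capacity are the two levers that bring the estimate back under the provisioned limit.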

Options A and B are incorrect because the errors arise on the consuming side, from the DynamoDB table that the KCL uses, and not from the capacity of the Kinesis stream.

Option D is incorrect because this is not a security related issue.

For more information on KCL and DynamoDB, please refer to the below URL.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-record-processor-ddb.html

Note:

For each Amazon Kinesis Data Streams application, the KCL uses a unique Amazon DynamoDB table to keep track of the application's state.

In this scenario, the KCL application is consuming records from Kinesis streams and providing real-time analytics to the data science team. However, after a dry run, the KCL application is getting provisioned throughput errors. Each option is evaluated below:

A. Ensure the number of shards is increased (incorrect): Adding shards raises the read and write capacity of the Kinesis stream, since each shard supports a fixed amount of throughput, and it also raises the cost of the stream. It does not address the errors here, which come from the DynamoDB table that the KCL uses to track application state. As the AWS documentation quoted above notes, a stream with many shards can require more table throughput, so adding shards alone may increase the load on that table.

B. Increase the number of streams (incorrect): Adding streams increases overall Kinesis capacity and cost, but it does not change the provisioned throughput of the DynamoDB table that the KCL application uses, so the errors would persist.

C. Increase the throughput for DynamoDB tables (correct): The KCL uses a DynamoDB table to keep track of the application's state, including checkpoints. Checkpointing writes to this table, and if the table is not provisioned with enough read and write capacity, those requests are throttled and the application receives provisioned throughput errors. Increasing the table's provisioned throughput resolves them.

D. Ensure the application has the right IAM Role attached (incorrect): The application does need an IAM role that allows it to access the Kinesis stream and the DynamoDB table, but missing permissions would cause authorization failures rather than provisioned throughput errors. This is not a security-related issue.
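For option C, one way to raise the table's capacity is the DynamoDB UpdateTable operation. The following is a minimal sketch using boto3's `update_table` call, assuming the table uses provisioned capacity mode. The helper names, the table name `my-kcl-application`, and the capacity values are hypothetical placeholders, not values from the question.

```python
def build_throughput_update(table_name, read_units, write_units):
    """Build the keyword arguments for a DynamoDB UpdateTable request."""
    if read_units < 1 or write_units < 1:
        raise ValueError("capacity units must be at least 1")
    return {
        "TableName": table_name,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


def raise_table_throughput(dynamodb_client, table_name, read_units, write_units):
    """Send the update through any client object exposing update_table()."""
    request = build_throughput_update(table_name, read_units, write_units)
    return dynamodb_client.update_table(**request)


def main():
    # boto3 is imported here so the helpers above can be used without it installed.
    import boto3

    client = boto3.client("dynamodb")
    # Hypothetical table name and capacity values; substitute your own.
    raise_table_throughput(client, "my-kcl-application", 50, 50)
```

Calling `main()` requires AWS credentials with permission to update the table. The request builder is kept separate so the parameters can be inspected before anything is sent.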

In conclusion, the provisioned throughput errors in the KCL application are resolved by increasing the throughput for the DynamoDB table (option C). Increasing the number of shards or streams, or changing the IAM role, does not address the cause.