HikeHills - AWS Certified Big Data - Specialty Exam Question


Question

HikeHills.com (HH) is an online specialty retailer that sells clothing and outdoor recreation gear for trekking, camping, road biking, mountain biking, rock climbing, ice climbing, skiing, avalanche protection, snowboarding, fly fishing, kayaking, rafting, road and trail running, and more. HH runs its entire online infrastructure on Java-based web applications running on AWS.

HH captures clickstream data and uses a custom-built recommendation engine to recommend products, which ultimately improves sales and helps them understand customer preferences. They already use the AWS Kinesis Producer Library (KPL) to collect events and transaction logs and to process the stream.

The event/log size is around 12 bytes. HH has the following requirements for processing the ingested data to support their enterprise search built on Elasticsearch:

- Load the data (syslog and transformed data) into ES.
- Capture transformation and delivery failures into the same S3 bucket to address audit requirements.
- Back up the syslog streaming data into an S3 bucket.

Select 3 options.

Answers

A. Streaming data can be delivered directly into the Elasticsearch domain.

B. Streaming data is delivered to your S3 bucket first. Kinesis Data Firehose then issues an Amazon Elasticsearch COPY command to load data from your S3 bucket to your Amazon Elasticsearch cluster.

C. The transformation failures and delivery failures are loaded into processing-failed and errors folders in the same S3 bucket.

D. The transformation failures and delivery failures are loaded into transform-failed and delivery-failed folders in the same S3 bucket.

E. When ES is selected as a destination, Source record S3 backup is enabled, and a backup S3 bucket is defined, untransformed incoming data can be delivered to a separate S3 bucket.

F. S3 backups can be managed through bucket policies.

Explanations

Answer: A, C, E.

Option A is correct - For Amazon ES destinations, streaming data is delivered to your Amazon ES cluster, and it can optionally be backed up to your S3 bucket concurrently.

https://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html#data-flow-diagrams
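As a rough illustration of this behavior, a delivery stream can be configured with an Elasticsearch destination and concurrent S3 backup using boto3. This is only a sketch under assumed names: the stream, role, domain, bucket, and Lambda ARNs below are placeholders, not values from the scenario.

import boto3

firehose = boto3.client("firehose")

firehose.create_delivery_stream(
    DeliveryStreamName="hh-clickstream-to-es",            # assumed name
    DeliveryStreamType="KinesisStreamAsSource",           # HH already ingests via a Kinesis stream
    KinesisStreamSourceConfiguration={
        "KinesisStreamARN": "arn:aws:kinesis:us-east-1:111122223333:stream/hh-clickstream",
        "RoleARN": "arn:aws:iam::111122223333:role/firehose-delivery-role",
    },
    ElasticsearchDestinationConfiguration={
        "RoleARN": "arn:aws:iam::111122223333:role/firehose-delivery-role",
        "DomainARN": "arn:aws:es:us-east-1:111122223333:domain/hh-search",
        "IndexName": "clickstream",
        "IndexRotationPeriod": "OneDay",
        "BufferingHints": {"IntervalInSeconds": 60, "SizeInMBs": 1},
        # Back up all source records (not just failures) to S3 concurrently.
        "S3BackupMode": "AllDocuments",
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::111122223333:role/firehose-delivery-role",
            "BucketARN": "arn:aws:s3:::hh-firehose-backup",
            "Prefix": "syslog-backup/",
            "CompressionFormat": "GZIP",
        },
        # Optional record transformation before delivery to ES.
        "ProcessingConfiguration": {
            "Enabled": True,
            "Processors": [{
                "Type": "Lambda",
                "Parameters": [{
                    "ParameterName": "LambdaArn",
                    "ParameterValue": "arn:aws:lambda:us-east-1:111122223333:function:hh-transform",
                }],
            }],
        },
    },
)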

Option B is incorrect - For Amazon ES destinations, streaming data is delivered directly to your Amazon ES cluster, and it can optionally be backed up to your S3 bucket concurrently. The stage-in-S3-then-COPY flow applies to Amazon Redshift destinations, not Elasticsearch.

https://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html#data-flow-diagrams

Option C is correct - When data transformation is enabled and Source record S3 backup is configured, untransformed incoming data can be delivered to a separate S3 bucket, while transformation and delivery failures are delivered to the processing-failed and errors folders in the same S3 bucket, which addresses the audit requirement.

https://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html
https://docs.aws.amazon.com/firehose/latest/dev/basic-deliver.html#retry
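For context, a transformation function follows the Firehose record-transformation contract: it receives a batch of base64-encoded records and must return each one with a result of Ok, Dropped, or ProcessingFailed. The sketch below is illustrative (the enrichment field is an assumption); records returned as ProcessingFailed are the ones Firehose writes to the processing-failed folder of the configured S3 bucket.

import base64
import json

def lambda_handler(event, context):
    # Firehose invokes the function with event["records"]; each record's data is base64-encoded.
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            payload["source"] = "hh-clickstream"  # hypothetical enrichment
            data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("utf-8")
            output.append({"recordId": record["recordId"], "result": "Ok", "data": data})
        except Exception:
            # Malformed events are flagged so Firehose routes them to the processing-failed folder.
            output.append({"recordId": record["recordId"], "result": "ProcessingFailed",
                           "data": record["data"]})
    return {"records": output}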

Option D is incorrect - Transformation and delivery failures are delivered to the processing-failed and errors folders in the S3 bucket, not to transform-failed and delivery-failed folders.

https://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html
https://docs.aws.amazon.com/firehose/latest/dev/basic-deliver.html#retry
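An auditor could verify where failed records land by listing the failure prefixes of the bucket. The bucket name below is an assumption, and the exact delivery-failure folder name depends on the destination (for ES destinations it is typically elasticsearch_failed).

import boto3

s3 = boto3.client("s3")

# Enumerate failed records under the transformation-failure and delivery-failure prefixes.
for prefix in ("processing-failed/", "elasticsearch_failed/"):
    resp = s3.list_objects_v2(Bucket="hh-firehose-backup", Prefix=prefix)
    for obj in resp.get("Contents", []):
        print(obj["Key"], obj["Size"], obj["LastModified"])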

Option E is correct - When ES is selected as the destination, Source record S3 backup is enabled, and a backup S3 bucket is defined, untransformed incoming data is delivered to that separate S3 bucket, which satisfies the backup requirement.

https://docs.aws.amazon.com/firehose/latest/dev/create-
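To confirm that source record backup is active on an existing stream, the delivery stream description exposes the ES destination's backup settings. The stream name below is an assumed placeholder.

import boto3

firehose = boto3.client("firehose")

# Print the backup settings of the ES destination to confirm untransformed
# source records are copied to the separate backup bucket.
desc = firehose.describe_delivery_stream(DeliveryStreamName="hh-clickstream-to-es")
dest = desc["DeliveryStreamDescription"]["Destinations"][0]
es_dest = dest["ElasticsearchDestinationDescription"]
print(es_dest["S3BackupMode"])                                   # e.g. AllDocuments
print(es_dest["S3DestinationDescription"]["BucketARN"])
print(es_dest["S3DestinationDescription"].get("Prefix", ""))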

Option F is incorrect - Source record S3 backup is configured on the Firehose delivery stream itself; it is not managed through S3 bucket policies.

https://docs.aws.amazon.com/firehose/latest/dev/create-

The scenario describes that HikeHills.com (HH) captures clickstream data and transaction logs using the AWS Kinesis Producer Library (KPL) and wants to process this data to support its enterprise search built on Elasticsearch. The event/log size is around 12 bytes. Additionally, HH wants to capture transformation and delivery failures into the same S3 bucket for audit purposes and to back up the syslog streaming data into an S3 bucket.
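For completeness, the ingestion side can be sketched as follows. The real producer is the Java-based KPL, which adds batching, aggregation, and retries; the boto3 PutRecord call below is only an illustrative stand-in, and the stream name and event fields are assumptions.

import json
import boto3

kinesis = boto3.client("kinesis")

event = {"user_id": "u-123", "sku": "TENT-42", "action": "click"}  # hypothetical event

kinesis.put_record(
    StreamName="hh-clickstream",                 # assumed stream name
    Data=json.dumps(event).encode("utf-8"),
    PartitionKey=event["user_id"],               # spread load across shards by user
)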

Given these requirements, let's analyze the answer options:

A. Streaming data can be delivered directly into the Elasticsearch domain - This option is correct. For Amazon ES destinations, Kinesis Data Firehose delivers the stream directly to the ES cluster, which satisfies the requirement to load the syslog and transformed data into ES.

B. Streaming data is delivered to your S3 bucket first, and Kinesis Data Firehose then issues a COPY command to load data from your S3 bucket into the cluster - This option is incorrect. That stage-and-COPY flow applies to Amazon Redshift destinations. For Amazon ES destinations, Firehose delivers data directly to the cluster and can only back it up to S3 concurrently, so this does not describe how the data reaches Elasticsearch.

C. The transformation failures and delivery failures are loaded into processing-failed and errors folders in the same S3 bucket - This option is correct. These are the folders Firehose uses for failed records, so keeping them in the same S3 bucket satisfies the audit requirement.

D. The transformation failures and delivery failures are loaded into transform-failed and delivery-failed folders in the same S3 bucket - This option is incorrect. Firehose does not use transform-failed and delivery-failed folder names; failures are written to the processing-failed and errors folders.

E. When ES is selected as a destination, Source record S3 backup is enabled, and a backup S3 bucket is defined, untransformed incoming data can be delivered to a separate S3 bucket - This option is correct. Enabling Source record S3 backup delivers the untransformed (syslog) source records to the backup S3 bucket, which satisfies the backup requirement.

F. S3 backups can be managed through bucket policies - This option is incorrect. Source record backup is configured on the Firehose delivery stream, not through S3 bucket policies, and it doesn't address any of the requirements mentioned in the scenario.

Therefore, the correct answers are A, C, and E.