Tick-Bank Web Traffic Analytics Solution


Question

Tick-Bank is a privately held Internet retailer of both physical and digital products, founded in 2008.

The company has more than six million clients worldwide.

Tick-Bank aims to serve as a connection between digital content makers and affiliate dealers, who then promote that content to clients.

Tick-Bank's technology aids in payments, tax calculations and a variety of customer service tasks.

Tick-Bank helps entrepreneurs build visibility and revenue-making opportunities. To serve various business functions, Tick-Bank runs multiple Java-based web applications on Windows-based EC2 machines in AWS, managed by the internal IT Java team.

Tick-Bank is looking to enable web-site traffic analytics, thereby understanding user navigational behavior, preferences, and other click-related information.

The amount of data captured per click is in tens of bytes.

Tick-Bank has the following objectives in mind for the solution:

  1. Optimize the overall platform costs by accumulating user records, thereby improving throughput to the stream.
  2. Make minimal changes to the web application by embedding simple code.
  3. Integrate seamlessly to de-aggregate batched records on the consumer.
  4. Provide long-term storage in an S3 bucket for future integration with the BI application.

Select 2 options.

Answers

Explanations



Answer: A, D.

Option A is correct - The Kinesis Producer Library (KPL) acts as an intermediary between your application and the Kinesis Data Streams API. The KPL simplifies producer application development and aggregates user records into batches, which increases the payload size per stream record, improves throughput, and optimizes costs.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-concepts.html

The Kinesis Client Library (KCL) integrates seamlessly with the KPL to de-aggregate batched records on the consumer.

https://docs.aws.amazon.com/streams/latest/dev/developing-consumers-with-kcl.html
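
To make the idea of aggregation and de-aggregation concrete, here is a minimal, self-contained sketch. It is not the KPL or KCL API and does not use the KPL's actual wire format; the class name `AggregationSketch`, the length-prefixed layout, and the sample click strings are all assumptions chosen for illustration. It only shows the principle: many tiny user records are packed into one stream-record payload on the producer, and unpacked again on the consumer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a simplified stand-in for what KPL/KCL do internally.
public class AggregationSketch {

    // Producer side: pack many small user records into one payload.
    // Hypothetical layout: 4-byte length, then the record's UTF-8 bytes.
    public static byte[] aggregate(List<String> userRecords) {
        List<byte[]> encoded = new ArrayList<>();
        int size = 0;
        for (String record : userRecords) {
            byte[] bytes = record.getBytes(StandardCharsets.UTF_8);
            encoded.add(bytes);
            size += Integer.BYTES + bytes.length;
        }
        ByteBuffer buffer = ByteBuffer.allocate(size);
        for (byte[] bytes : encoded) {
            buffer.putInt(bytes.length);
            buffer.put(bytes);
        }
        return buffer.array();
    }

    // Consumer side: split one payload back into the original user records.
    public static List<String> deaggregate(byte[] payload) {
        List<String> userRecords = new ArrayList<>();
        ByteBuffer buffer = ByteBuffer.wrap(payload);
        while (buffer.hasRemaining()) {
            byte[] bytes = new byte[buffer.getInt()];
            buffer.get(bytes);
            userRecords.add(new String(bytes, StandardCharsets.UTF_8));
        }
        return userRecords;
    }

    public static void main(String[] args) {
        // Made-up click events of a few tens of bytes each.
        List<String> clicks = List.of("u1|/home", "u2|/cart", "u1|/checkout");
        byte[] payload = aggregate(clicks);
        System.out.println(payload.length + " bytes sent as one stream record");
        System.out.println(deaggregate(payload));
    }
}
```

In a real deployment the web application would hand each click to the KPL and the consumer would rely on the KCL, so neither side would implement this packing by hand.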

Option B is incorrect - Kinesis Data Streams does not inspect, interpret, or change the data in any way.

Each record also has an associated sequence number and partition key.

The API does not aggregate user records: each user record is sent as its own stream record.

PutRecords can carry several stream records in one HTTP request, which improves the throughput of the request, but the overall platform costs are not optimized.

https://docs.aws.amazon.com/streams/latest/dev/developing-producers-with-sdk.html
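
A rough back-of-the-envelope calculation shows why sending each click as its own stream record is wasteful. The sketch below assumes the documented per-shard write limits of 1,000 records per second and 1 MB per second (approximated here as 10^6 bytes), a click size of 50 bytes (the question only says "tens of bytes"), and a made-up target load of 100,000 clicks per second. It ignores partition-key and aggregation-format overhead, so the figures are indicative only.

```java
// Illustrative arithmetic only; ShardMath and its method names are hypothetical.
public class ShardMath {

    // Per-shard write limits, with 1 MB approximated as 10^6 bytes.
    static final long SHARD_RECORDS_PER_SEC = 1_000;
    static final long SHARD_BYTES_PER_SEC = 1_000_000;

    // One click per stream record: the record-count limit is hit first.
    public static long clicksPerSecondUnaggregated(int clickBytes) {
        long byBandwidth = SHARD_BYTES_PER_SEC / clickBytes;
        return Math.min(SHARD_RECORDS_PER_SEC, byBandwidth);
    }

    // Many clicks per stream record: only the byte limit matters
    // (overhead of the aggregation format is ignored here).
    public static long clicksPerSecondAggregated(int clickBytes) {
        return SHARD_BYTES_PER_SEC / clickBytes;
    }

    // Ceiling division: shards required to carry the target load.
    public static long shardsNeeded(long clicksPerSecond, long perShardCapacity) {
        return (clicksPerSecond + perShardCapacity - 1) / perShardCapacity;
    }

    public static void main(String[] args) {
        int clickBytes = 50;          // assumed size of one click record
        long targetLoad = 100_000;    // assumed clicks per second

        long plain = clicksPerSecondUnaggregated(clickBytes);
        long packed = clicksPerSecondAggregated(clickBytes);

        System.out.println("Clicks/s per shard without aggregation: " + plain);
        System.out.println("Clicks/s per shard with aggregation:    " + packed);
        System.out.println("Shards needed without aggregation: " + shardsNeeded(targetLoad, plain));
        System.out.println("Shards needed with aggregation:    " + shardsNeeded(targetLoad, packed));
    }
}
```

Under these assumptions a shard carries 1,000 clicks per second without aggregation and about 20,000 with it, so the assumed load needs 100 shards in the first case and 5 in the second.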

Option C is incorrect - The purpose of Kinesis agent is different.

Kinesis Agent is a stand-alone Java application that can easily collect and send data to Kinesis Data Streams.

The agent continuously monitors a set of files (typically log files) and sends new data to the stream; it does not aggregate user records.

https://docs.aws.amazon.com/streams/latest/dev/writing-with-agents.html

Option D is correct - The Kinesis Connector Library helps Java developers integrate Kinesis Data Streams with other AWS services.

The library provides connectors to various AWS services, including S3.

Each Amazon Kinesis connector application is a pipeline that determines how records from a Kinesis stream are handled.

https://github.com/awslabs/amazon-kinesis-connectors
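
The pipeline idea can be sketched without any AWS dependency. The connector library structures a pipeline around transformer, filter, buffer, and emitter stages; the class below mimics that flow with plain Java functional interfaces. The class name `ConnectorPipelineSketch`, its method signatures, and the in-memory list standing in for S3 objects are assumptions for illustration, not the library's actual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Illustrative only: simplified stand-ins for the connector pipeline stages.
public class ConnectorPipelineSketch {

    private final Function<String, String> transformer; // raw record -> model
    private final Predicate<String> filter;             // keep or drop a record
    private final int flushThreshold;                   // records per emitted batch
    private final List<String> buffer = new ArrayList<>();
    // Each inner list stands in for one object written to an S3 bucket.
    private final List<List<String>> emittedBatches = new ArrayList<>();

    public ConnectorPipelineSketch(Function<String, String> transformer,
                                   Predicate<String> filter,
                                   int flushThreshold) {
        this.transformer = transformer;
        this.filter = filter;
        this.flushThreshold = flushThreshold;
    }

    // Transform, filter, buffer, and emit once the buffer is full.
    public void process(String rawRecord) {
        String model = transformer.apply(rawRecord);
        if (!filter.test(model)) {
            return;
        }
        buffer.add(model);
        if (buffer.size() >= flushThreshold) {
            flush();
        }
    }

    // Emit whatever is buffered as one batch.
    public void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        emittedBatches.add(new ArrayList<>(buffer));
        buffer.clear();
    }

    public List<List<String>> emittedBatches() {
        return emittedBatches;
    }

    public static void main(String[] args) {
        ConnectorPipelineSketch pipeline =
                new ConnectorPipelineSketch(String::trim, s -> !s.isEmpty(), 2);
        for (String raw : List.of(" u1|/home ", "", "u2|/cart", "u1|/checkout")) {
            pipeline.process(raw);
        }
        pipeline.flush(); // emit the partial final batch
        System.out.println(pipeline.emittedBatches());
    }
}
```

Buffering before emitting matters for the S3 objective: writing one object per click would produce a very large number of tiny objects, whereas batching yields fewer, larger objects that are easier for a BI tool to read.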

Option E is incorrect - The Data Streams API can act as a consumer of the data stream but cannot de-aggregate the data.

Also, the Streams API would still need the Kinesis connectors to deliver data to other destinations.

https://docs.aws.amazon.com/streams/latest/dev/developing-consumers-with-sdk.html

Tick-Bank is looking to enable web-site traffic analytics by capturing user navigational behavior, preferences, and click-related information. The amount of data captured per click is in tens of bytes. Tick-Bank has set the following objectives for the solution:

  1. Optimize the overall platform costs by accumulating user records, thereby improving throughput to the stream.
  2. Minimal changes to the web application by embedding simple code.
  3. Integrate seamlessly to de-aggregate batched records on the consumer.
  4. Long-term storage into S3 storage bucket for future integration with the BI application.

To achieve these objectives, Tick-Bank can choose from various AWS services that can help with data collection, processing, and storage. Let's evaluate the options provided and determine the best ones.

Option A: Use Kinesis Producer Library (KPL) microservices to make necessary code changes, aggregation of user records through batching, thereby increasing throughput before stream record is processed, and use Kinesis Client Library (KCL) to de-aggregate batched records from the stream.

This option involves using the Kinesis Producer Library (KPL) to aggregate the data before it is sent to Kinesis Data Streams. Batching many small user records into fewer stream records improves throughput, which helps optimize platform costs. The Kinesis Client Library (KCL) can then de-aggregate the batched records on the consumer side. Both the KPL and the KCL are available as Java libraries, so they fit the existing Java-based web applications. This option satisfies the first three objectives set by Tick-Bank; it does not by itself write the data to S3.

Option B: Use Kinesis Data Stream API with AWS SDK for Java to aggregate using PutRecords and de-aggregate batched records using GetRecords.

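This option involves using the Kinesis Data Streams API with the AWS SDK for Java. PutRecords sends multiple stream records in a single HTTP request, but each user record still becomes its own stream record, so it does not aggregate user records the way the KPL does, and GetRecords returns stream records as stored without de-aggregating them. This option therefore does not meet the first and third objectives, and it requires more coding effort than Option A.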

Option C: Use Kinesis agent to pre-process, aggregate, batch the clickstream data and use Kinesis Data Streams to de-aggregate batched records.

This option involves using the Kinesis Agent to pre-process, aggregate, and batch the clickstream data before it is sent to Kinesis Data Streams. However, the agent is a stand-alone application that monitors files such as logs and does not aggregate user records, and Kinesis Data Streams does not inspect or change records, so it cannot de-aggregate them. This option is not suitable for Tick-Bank.

Option D: Use KCL with the Kinesis Connector Library to de-aggregate and write data to S3.

This option involves using the Kinesis Connector Library with the KCL to de-aggregate the data and write it to an S3 bucket. On its own it does not address the first objective of accumulating user records on the producer, but it covers the fourth objective of long-term storage in S3, so it complements Option A.

Option E: Use Data Streams API to de-aggregate and write data to S3.

This option involves using the Data Streams API to de-aggregate and write data to an S3 bucket. However, the Data Streams API cannot de-aggregate batched records, and it would still need a connector to deliver data to S3, so it does not satisfy the third and fourth objectives.

Therefore, Options A and D are the best choices for Tick-Bank: Option A covers aggregation on the producer with minimal code changes and de-aggregation on the consumer, and Option D covers long-term storage in S3.