Top Most Friendly User | KindleYou Mobile App | AWS Certified Big Data - Specialty

Retrieving the Top Most Friendly User in an Efficient Manner - KindleYou Mobile App

Question

KindleYou is a location-based social search mobile app that allows users to like or dislike other users, and allows users to chat if both parties liked each other in the app.

KindleYou has more than 1 billion customers worldwide. It uses DynamoDB to support the mobile application and S3 to host the images and other documents shared between users. The application tracks the number of photos uploaded by each user in real time.

Which of the following options would help retrieve the top most friendly user in an efficient manner?

Answers

Explanations

A. B. C. D.

Answer: A.

Option A is correct: add a Boolean attribute 'Is_Top_Friendly' to the table and create a sparse index on it.

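Sparse indexes are useful for queries over a small subsection of a table. DynamoDB writes an item to a secondary index only if the item has the index key attribute, so if only the top users carry the flag, the index stays tiny and a query against it never touches the rest of the table. Note that DynamoDB key attributes must be of type string, number, or binary, so in practice the flag would be stored as one of those types rather than as a true Boolean.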
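The sparse-index behavior can be illustrated with a toy in-memory model. This is only a sketch, not the DynamoDB API: the `build_sparse_index` helper, the `is_top_friendly` attribute name, and the sample users are all hypothetical, and the flag is stored as a number because DynamoDB index keys cannot be Boolean.

```python
# Toy in-memory model of a sparse secondary index (not the DynamoDB API).
# Attribute names and sample users below are hypothetical.

def build_sparse_index(table, index_key):
    """Return only the items that carry `index_key`, keyed by primary key.

    This mirrors how a secondary index is populated: an item appears in
    the index only if the index key attribute exists on that item.
    """
    return {
        user_id: item
        for user_id, item in table.items()
        if index_key in item
    }


users = {
    "u1": {"name": "Ana", "likes": 120},
    "u2": {"name": "Bo", "likes": 9800, "is_top_friendly": 1},
    "u3": {"name": "Cy", "likes": 45},
}

# Only the flagged item lands in the index, however large the table is.
top_index = build_sparse_index(users, "is_top_friendly")
top_users = [item["name"] for item in top_index.values()]
print(len(top_index), top_users)
```

Reading the top user then means reading an index with one entry instead of examining every user item.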

Option B is incorrect: aggregation is about maintaining near-real-time aggregations and key metrics on top of rapidly changing data, which is increasingly valuable to businesses for making rapid decisions. It describes how such metrics are kept current rather than how a single flagged user is retrieved.

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-gsi-aggregation.html
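As a rough illustration of what maintaining an aggregate on changing data involves, the sketch below processes like/dislike events one at a time and keeps a running "top user". It is a plain-Python model, not DynamoDB code: the `LikeAggregator` class and the event shape (`actor`, `target`, `action`) are assumptions made for this example.

```python
from collections import Counter


class LikeAggregator:
    """Keeps per-user like counts and the current top user up to date."""

    def __init__(self):
        self.like_counts = Counter()
        self.top_user = None

    def on_event(self, event):
        # Only "like" events change the aggregate in this toy model.
        if event["action"] != "like":
            return
        target = event["target"]
        self.like_counts[target] += 1
        # Replace the top user only when the new count is strictly higher.
        if (self.top_user is None
                or self.like_counts[target] > self.like_counts[self.top_user]):
            self.top_user = target


events = [
    {"actor": "u1", "target": "u2", "action": "like"},
    {"actor": "u3", "target": "u2", "action": "like"},
    {"actor": "u2", "target": "u1", "action": "like"},
    {"actor": "u1", "target": "u3", "action": "dislike"},
]

agg = LikeAggregator()
for event in events:
    agg.on_event(event)
print(agg.top_user, dict(agg.like_counts))
```

The aggregate has to be updated on every write, and the result still needs somewhere cheap to be read from, which is the part a sparse index covers.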

Option C is incorrect: GSI overloading only addresses storing different kinds of values in the same indexed attribute so that one index can serve different queries.

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-gsi-overloading.html
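A toy model may make the overloading idea more concrete. In this sketch, which is plain Python rather than DynamoDB code, every item shares a generic `data` attribute whose meaning depends on the item's `sk` value; the attribute names, the `query_overloaded_index` helper, and the sample rows are hypothetical.

```python
# Each item stores a differently-typed value in the same generic "data"
# attribute; a single index on (sk, data) then serves several query shapes.
items = [
    {"pk": "user#u1", "sk": "profile", "data": "name#Ana"},
    {"pk": "user#u1", "sk": "city", "data": "city#Lisbon"},
    {"pk": "user#u2", "sk": "profile", "data": "name#Bo"},
    {"pk": "user#u2", "sk": "city", "data": "city#Oslo"},
]


def query_overloaded_index(items, sk_value, data_prefix):
    """Model a query on an index keyed by `sk` (partition) and `data` (sort)."""
    matches = [
        item for item in items
        if item["sk"] == sk_value and item["data"].startswith(data_prefix)
    ]
    return sorted(matches, key=lambda item: item["data"])


# Two different questions answered by the same index.
in_oslo = [i["pk"] for i in query_overloaded_index(items, "city", "city#O")]
profiles = [i["data"] for i in query_overloaded_index(items, "profile", "name#")]
print(in_oslo, profiles)
```

The index answers several kinds of lookups, but nothing in it ranks users by likes or marks one of them as the top user.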

Option D is incorrect: write sharding adds an attribute containing a (0-N) value to every item and uses it as the global secondary index partition key, which enables selective queries across the entire key space. Retrieving a single top user would then require querying every shard and merging the results.

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-indexes-gsi-sharding.html
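The cost of reading from a sharded index can be sketched in plain Python. This is a model under stated assumptions, not DynamoDB code: `NUM_SHARDS`, the CRC-based `shard_for` function, and the sample like counts are all made up for illustration.

```python
import zlib

NUM_SHARDS = 4  # hypothetical shard count (the "0-N" value on each item)


def shard_for(user_id, num_shards=NUM_SHARDS):
    """Deterministically map a user id to a shard number in [0, num_shards)."""
    return zlib.crc32(user_id.encode("utf-8")) % num_shards


def build_sharded_index(users, num_shards=NUM_SHARDS):
    """Group (likes, user_id) pairs by shard, as a sharded index would."""
    shards = {shard: [] for shard in range(num_shards)}
    for user_id, likes in users.items():
        shards[shard_for(user_id, num_shards)].append((likes, user_id))
    return shards


def top_user(shards):
    """Scatter-gather: take the best entry of every non-empty shard, then merge."""
    best_per_shard = [max(entries) for entries in shards.values() if entries]
    return max(best_per_shard)[1]


users = {"u1": 120, "u2": 9800, "u3": 45, "u4": 300}
shards = build_sharded_index(users)
print(top_user(shards))
```

Every shard has to be consulted before the winner is known, so the read fans out into one query per shard value instead of a single lookup.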

To retrieve the top most friendly user in an efficient manner, we need to identify the best approach for querying the data. The query should be optimized for performance, cost, and scalability. Based on the given scenario, we have the following options:

A. Using Sparse Indexes: A sparse index contains only the items that have the index key attribute, so it suits queries where only a few items carry that attribute. In this scenario, if only the top most friendly user (or a small set of top users) carries an attribute such as 'Is_Top_Friendly', an index on that attribute holds just those items, and a query against it returns them without scanning the table. Because the index stays small even with more than 1 billion users, its storage and read costs stay low. The application does need to keep the flag up to date as like counts change.

B. Using Aggregation: Aggregation is a way to group and summarize data based on a specific attribute or criteria. In this scenario, we could use aggregation to calculate the number of likes for each user and derive the top most friendly user. However, DynamoDB has no native aggregation queries, so the aggregates would have to be computed and maintained outside the query itself, for example with DynamoDB Streams and AWS Lambda, or with a separate analytics service such as Amazon EMR or Amazon Athena. That adds moving parts and still leaves open how the result is looked up efficiently.

C. Using Global Secondary Index Overloading: GSI overloading is a technique in which a single GSI is keyed on a generic attribute that holds different kinds of values for different item types, so that one index can serve several query patterns. It reduces the number of indexes a table needs, but it does not by itself rank users by likes or single out the top user.

D. Using Global Secondary Index Sharding: GSI write sharding adds an attribute containing a value from 0 to N to every item and uses it as the GSI partition key, which spreads writes evenly across partitions. Querying the index then means issuing one query per shard value and merging the results in the application. This helps with hot partitions, but it adds read fan-out and is more machinery than is needed to fetch a single top user.

Overall, a sparse index (option A) is the most efficient choice for this question: the index contains only the flagged user, so the lookup is a single small query regardless of table size. The other techniques address different problems: maintaining aggregates, serving multiple query patterns from one index, and distributing load across partitions.