Extracting and Filtering Data from DynamoDB for In-House Data Warehouse

Question

Your manager has assigned you a task to write an API that will extract data from a DynamoDB table with 30 columns and more than 10,000 rows of open data.

Not all the columns are relevant for your use-case.

Hence your manager has asked you to filter out only 6 columns out of all and store them in your in-house data warehouse tool.

How can you achieve this?

Answers

Explanations

A. B. C. D.

Correct Answer: C.

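Using ProjectionExpression with the Scan operation is the correct answer: Scan reads every item in the table, and ProjectionExpression limits each returned item to the attributes it lists. A single Scan call returns at most 1 MB of data, so reading the whole table means following LastEvaluatedKey from page to page.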

A is incorrect: DataFilter is an invalid attribute.

B is incorrect: the Query operation is used to retrieve items by partition key, optionally narrowed by sort key, and the question states no such requirement. The task is to select a subset of columns (attributes), not to restrict which rows (items) are returned.

D is incorrect: ColumnProjection is an invalid attribute.

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.ProjectionExpressions.html

The correct answer to the question is option C: use the ProjectionExpression parameter while performing a Scan operation.

Explanation: To extract data from a DynamoDB table, we can use either the Scan or the Query operation. Scan reads all the items in a table or a secondary index, whereas Query reads only the items that share a given partition key value. Because the task needs every row and names no partition key, Scan is the appropriate operation.

The next step is to keep only the 6 relevant columns from the retrieved data. For this, we can use the ProjectionExpression parameter, which makes Scan or Query return only the specified attributes of each item.
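The two steps above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the helper name `scan_projected` is hypothetical, and `table` is assumed to be a boto3 DynamoDB `Table` resource (or any object whose `scan` method accepts the same keyword arguments). Attribute names are aliased through ExpressionAttributeNames so that DynamoDB reserved words do not break the expression.

```python
from typing import Any, Dict, Iterator, List


def scan_projected(table: Any, attributes: List[str]) -> Iterator[Dict[str, Any]]:
    """Yield every item in the table, restricted to the given attributes.

    `table` is assumed to expose scan(**kwargs) like a boto3 Table resource.
    """
    # Alias each attribute (#a0, #a1, ...) so reserved words such as
    # "name" or "status" can be used safely in the projection.
    names = {f"#a{i}": attr for i, attr in enumerate(attributes)}
    kwargs: Dict[str, Any] = {
        "ProjectionExpression": ", ".join(names),
        "ExpressionAttributeNames": names,
    }
    while True:
        page = table.scan(**kwargs)
        yield from page.get("Items", [])
        # LastEvaluatedKey is present only when more pages remain.
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break
        kwargs["ExclusiveStartKey"] = last_key
```

With boto3 installed and credentials configured, `table` would typically come from `boto3.resource("dynamodb").Table(...)`, and the 6 column names would be passed as `attributes`. The yielded items can then be written to the data warehouse in batches.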

Therefore, option C is the correct answer: it uses ProjectionExpression with a Scan operation to return only the required columns from the DynamoDB table.

Option A is incorrect as it mentions using the DataFilter operation, which is not a valid operation in DynamoDB.

Option B is incorrect as it suggests using ProjectionExpression while performing a Query operation. A Query retrieves only the items that share one partition key value, so it cannot return the whole table in this scenario.
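To make the contrast concrete, here is a hedged sketch of the request parameters a Query would need. The helper `build_query_kwargs` is hypothetical, and it assumes a string-typed partition key. The point is that a KeyConditionExpression naming one partition key value is mandatory, which the question never provides.

```python
from typing import Any, Dict, List


def build_query_kwargs(
    table_name: str, pk_name: str, pk_value: str, attributes: List[str]
) -> Dict[str, Any]:
    """Build keyword arguments for the low-level DynamoDB Query call.

    Assumes the partition key is of type String ("S").
    """
    # Alias the projected attributes, as one would for a Scan.
    names = {f"#a{i}": attr for i, attr in enumerate(attributes)}
    projection = ", ".join(names)
    # Query cannot run without a condition on the partition key.
    names["#pk"] = pk_name
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": {":pk": {"S": pk_value}},
        "ProjectionExpression": projection,
    }
```

The resulting dictionary could be passed to `boto3.client("dynamodb").query(**kwargs)`. It would return only the items under that one partition key, which is why Query does not fit a task that needs all rows.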

Option D is incorrect as it mentions using the ColumnProjection operation, which is not a valid operation in DynamoDB.