Customer Dimension Table: Methods for Tracking Changing Data - DP-203 Exam Preparation | Microsoft Azure

Methods for Tracking Changing Data

Question

A B2B sales startup is creating a customer dimension table.

This will be a part of their star schema model.

It is used to track customer relations.

They collect details like Tax Registration number, mobile phone number, email-id, pincode, etc.

There are some instances where the email-id and mobile phone numbers change.

Whenever a customer updates the data at the source, the new value should be stored in a separate column, and only the two latest versions of the data should be kept.

Which of the following is the most suitable method?

Answers

Explanations


A. Type 2 SCD

B. Type 1 SCD

C. Type 3 SCD

D. Type 6 SCD

Correct Answer: C

In this type of table, there will be some columns where data once entered will not change.

But here, we have to focus on the columns where data will change later.

In the question, attributes such as email ID and phone number change whenever the customer updates them.

This clearly points to the use of a Slowly Changing Dimension (SCD).

Type 3 keeps a separate column for each tracked attribute, so the new value the customer has entered and the previous value are stored side by side in the same row.

But we should always keep in mind that this SCD is not a commonly used method in the case of tables with a large number of members.
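The Type 3 behaviour described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production design: the table name dim_customer, the column names, and the helper update_email_type3 are all assumptions made for the example.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")

# Hypothetical Type 3 layout: one row per customer, with a "current" and a
# "previous" column for the attribute that changes.
conn.execute(
    """
    CREATE TABLE dim_customer (
        customer_key   INTEGER PRIMARY KEY,
        tax_reg_no     TEXT,
        current_email  TEXT,
        previous_email TEXT
    )
    """
)
conn.execute(
    "INSERT INTO dim_customer VALUES (1, 'TAX-001', 'old@example.com', NULL)"
)


def update_email_type3(conn, customer_key, new_email):
    """Shift the current value into the 'previous' column, then store the new one."""
    # The right-hand sides are evaluated against the row's old values, so
    # previous_email receives the e-mail that was current before the update.
    conn.execute(
        "UPDATE dim_customer "
        "SET previous_email = current_email, current_email = ? "
        "WHERE customer_key = ?",
        (new_email, customer_key),
    )


update_email_type3(conn, 1, "new@example.com")
update_email_type3(conn, 1, "newest@example.com")

row = conn.execute(
    "SELECT current_email, previous_email FROM dim_customer WHERE customer_key = 1"
).fetchone()
row_count = conn.execute("SELECT COUNT(*) FROM dim_customer").fetchone()[0]
print(row, row_count)
```

After two updates the table still holds a single row for the customer, and only the two latest e-mail values survive; the original value has been dropped, which is the trade-off of this approach.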

Option A is incorrect: Type 2 SCDs are mainly for versioning; each change adds a new row rather than a new column.

Option B is incorrect: Type 1 overwrites the existing values with the latest ones as the source changes, so no previous version is kept.

Option C is correct: The main purpose of Type 3 is to store the data versions in different columns.

Option D is incorrect: Type 6 is a combination of Type 1, 2, and 3 and is not a suitable method here.


The most suitable method for the given scenario is a Type 3 Slowly Changing Dimension (SCD).

Slowly Changing Dimensions (SCD) are used in data warehousing to track changes in dimensional data over time. There are different types of SCDs based on the way they handle changes to the dimension attributes.

Type 1 SCD: This type of SCD overwrites the existing record with new information, without maintaining a history of the changes. This approach works well when the previous information is not important, and only the most recent data is relevant.
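As a rough sketch of the overwrite behaviour, the dimension can be modelled as a list of dictionaries. The function apply_type1 and the column names are hypothetical and chosen only for illustration.

```python
def apply_type1(dim_rows, customer_id, changes):
    """Overwrite the changed attributes in place; the old values are lost."""
    for row in dim_rows:
        if row["customer_id"] == customer_id:
            row.update(changes)
            return row
    raise KeyError(customer_id)


dim_customer = [
    {"customer_id": "C001", "email": "old@example.com", "mobile": "555-0100"},
]

apply_type1(dim_customer, "C001", {"email": "new@example.com"})
print(dim_customer)
```

The table keeps one row per customer, and nothing in it records that the e-mail address was ever different.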

Type 2 SCD: This type of SCD creates a new record with a new surrogate key and maintains the history of the changes, allowing for analysis of changes over time. This approach works well when the full history of changes must be kept, which is more than this scenario requires, since only the two latest versions of the email ID and mobile phone number are needed and they must sit in separate columns.
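A minimal sketch of row versioning follows, again using a list of dictionaries. The validity columns (valid_from, valid_to, is_current) and the helper apply_type2 are assumptions for the example; real warehouses vary in how they mark the current row.

```python
from datetime import date


def apply_type2(dim_rows, customer_id, changes, effective):
    """Expire the current row and append a new version with a new surrogate key."""
    current = next(
        r for r in dim_rows if r["customer_id"] == customer_id and r["is_current"]
    )
    # Close the validity window of the existing version.
    current["valid_to"] = effective
    current["is_current"] = False
    # Copy the old row, apply the changes, and assign the next surrogate key.
    new_row = {
        **current,
        **changes,
        "customer_key": max(r["customer_key"] for r in dim_rows) + 1,
        "valid_from": effective,
        "valid_to": None,
        "is_current": True,
    }
    dim_rows.append(new_row)
    return new_row


dim_customer = [
    {
        "customer_key": 1,
        "customer_id": "C001",
        "email": "old@example.com",
        "valid_from": date(2023, 1, 1),
        "valid_to": None,
        "is_current": True,
    },
]

apply_type2(dim_customer, "C001", {"email": "new@example.com"}, date(2024, 6, 1))
print(dim_customer)
```

Every change adds a row, so the full history is preserved, but the versions live in separate rows rather than in separate columns.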

Type 3 SCD: This type of SCD maintains a single current record and stores limited history in additional columns. This approach works well when we need to track only a small number of attributes and their immediate previous values.

Type 6 SCD: This is a combination of Type 1, Type 2, and Type 3 SCDs. It maintains a current record, a history of changes to some attributes, and a limited history of changes to others. This approach works well when we need to track a large number of attributes, some of which change frequently and others that change less frequently.
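One common way to combine the three techniques is sketched below: a new row is added per change (Type 2), each row carries an extra column for the current value (Type 3), and that column is overwritten on all of the customer's rows (Type 1). The column names historical_email and current_email and the helper apply_type6 are hypothetical; other Type 6 layouts exist.

```python
from datetime import date


def apply_type6(dim_rows, customer_id, new_email, effective):
    """Add a new version row and refresh current_email on every older version."""
    history = [r for r in dim_rows if r["customer_id"] == customer_id]
    current = next(r for r in history if r["is_current"])
    # Type 2 part: expire the current version.
    current["valid_to"] = effective
    current["is_current"] = False
    # Type 1 part: overwrite the current-value column on all existing versions.
    for r in history:
        r["current_email"] = new_email
    # Type 2 + Type 3 part: the new row holds both the as-was and current columns.
    dim_rows.append(
        {
            "customer_key": max(r["customer_key"] for r in dim_rows) + 1,
            "customer_id": customer_id,
            "historical_email": new_email,
            "current_email": new_email,
            "valid_from": effective,
            "valid_to": None,
            "is_current": True,
        }
    )


dim_customer = [
    {
        "customer_key": 1,
        "customer_id": "C001",
        "historical_email": "old@example.com",
        "current_email": "old@example.com",
        "valid_from": date(2023, 1, 1),
        "valid_to": None,
        "is_current": True,
    },
]

apply_type6(dim_customer, "C001", "new@example.com", date(2024, 6, 1))
print(dim_customer)
```

The older row still shows the e-mail that was valid at the time, while its current_email column already reflects the latest value.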

In this scenario, each new email ID or mobile phone number must be stored in a separate column, and only the two latest versions of the data must be kept. Hence, Type 3 SCD is the most suitable method, as it holds the current value and the immediately previous value in separate columns of a single customer record.