Migrating Parson Fortunes Ltd's Data Warehouse to AWS: A Cost-Effective Solution for Analytics


Question

Parson Fortunes Ltd is an Asia-based department store operator with an extensive network of 131 stores, spanning approximately 4.1 million m² of retail space across cities in India, China, Vietnam, Indonesia, and Myanmar. Parson holds large data assets, around 10 TB of structured data and 5 TB of unstructured data, and plans to host its data warehouse on AWS and store its unstructured data on S3.

Parson's IT team is well aware of the scalability and performance capabilities of AWS services.

Parson currently runs its DWH on-premises on Teradata and is concerned about the overall cost of the DWH on AWS.

They initially want to migrate the platform to AWS and use it for basic analytics, and they do not have any performance-intensive workloads in place for the time being.

They have business needs around real-time data integration and data-driven analytics on a five-year roadmap.

Currently, the number of users accessing the application would be around 100.

What is your suggestion? Select one option.

Answers

A. Launch a Redshift cluster with ds2.xlarge nodes.
B. Launch a Redshift cluster with ds2.8xlarge nodes.
C. Launch a Redshift cluster with dc2.xlarge nodes.
D. Launch a Redshift cluster with dc2.8xlarge nodes.

Answer: A

Explanations

Option A is correct. DS2 node types are optimized for large data workloads and use hard disk drive (HDD) storage. A cluster of ds2.xlarge nodes fulfills the requirements, since it provides massively parallel processing across multiple nodes.

Based on the amount of data to be loaded (around 10 TB of structured data), this is the right option.

https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html#rs-about-clusters-and-
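For illustration, below is a minimal boto3 sketch of provisioning such a cluster with ds2.xlarge nodes. The cluster identifier, database name, credentials, region, and node count are assumptions made for the example, not values given in the question.

```python
import boto3

# Hypothetical sketch: provisioning a multi-node Redshift cluster with
# dense-storage ds2.xlarge nodes. Identifiers, credentials and node count
# are illustrative assumptions.
redshift = boto3.client("redshift", region_name="ap-south-1")

response = redshift.create_cluster(
    ClusterIdentifier="parson-dwh",       # assumed cluster name
    ClusterType="multi-node",
    NodeType="ds2.xlarge",                # HDD-backed, roughly 2 TB of storage per node
    NumberOfNodes=6,                      # ~12 TB raw capacity for ~10 TB of data
    DBName="analytics",
    MasterUsername="dwh_admin",
    MasterUserPassword="REPLACE_ME",      # placeholder; use Secrets Manager in practice
    PubliclyAccessible=False,
)
print(response["Cluster"]["ClusterStatus"])
```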

Option B is incorrect. Although the ds2.xlarge and ds2.8xlarge node types cost roughly the same per terabyte of storage, a cluster of ds2.xlarge nodes is sufficient for the requirements in the question.

Please refer to the following link.

https://aws.amazon.com/blogs/aws/amazon-redshift-now-faster-and-more-cost-effective-than-ever/

Option C is incorrect. DC2 node types are optimized for performance-intensive workloads, which is not a requirement here.

https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html#rs-about-clusters-and-

Option D is incorrect. As with Option C, DC2 node types are optimized for performance-intensive workloads, which is not a requirement here.

https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html#rs-about-clusters-and-

Parson Fortunes Ltd, an Asian-based department store operator, plans to migrate their on-premises data warehouse (DWH) to AWS. They have 10 TB of structured data and 5 TB of unstructured data and plan to host their DWH on AWS and store unstructured data on S3. Their IT team is well aware of AWS's scalability and performance capabilities, but they are concerned about the overall costs of their DWH on AWS. They plan to use the platform for basic analytics with no performance-intensive workloads for the time being. They also have business needs around real-time data integration and data-driven analytics as a roadmap for the next 5 years. The number of users accessing the application would be around 100.

Given this scenario, the best suggestion is to launch a Redshift cluster with ds2.xlarge nodes (Option A).

Redshift is a fully managed data warehouse service in AWS that can handle petabyte-scale data warehousing. It is optimized for querying and analyzing large datasets, making it a suitable option for Parson Fortunes Ltd.

Node types are used to determine the CPU, memory, storage, and network capacity of a Redshift cluster. The four node types mentioned in the options are DS2.xlarge, DS2.8xlarge, DC2.xlarge, and DC2.8xlarge.

DS2.xlarge and DS2.8xlarge are dense-storage node types backed by HDDs, designed to hold large data volumes at a low cost per terabyte. With about 10 TB of structured data, cost sensitivity, and only basic analytics for the time being, a dense-storage node type is the appropriate choice, and a small cluster of ds2.xlarge nodes provides enough capacity (see the sizing sketch below).
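As a rough illustration of that sizing, the sketch below estimates the ds2.xlarge node count needed for the 10 TB of structured data. The per-node figures are approximate values from the Redshift documentation at the time, and the growth margin is an assumption.

```python
import math

# Rough sizing sketch. Per-node figures are approximate values from the
# Redshift documentation at the time of writing; verify before relying on them.
DS2_NODE_SPECS = {
    "ds2.xlarge":  {"vcpu": 4,  "ram_gib": 31,  "storage_tb": 2.0},   # HDD-backed
    "ds2.8xlarge": {"vcpu": 36, "ram_gib": 244, "storage_tb": 16.0},  # HDD-backed
}

structured_data_tb = 10   # from the question
growth_margin = 1.2       # assumed headroom for staging and growth

needed_tb = structured_data_tb * growth_margin
nodes_needed = math.ceil(needed_tb / DS2_NODE_SPECS["ds2.xlarge"]["storage_tb"])
print(f"ds2.xlarge nodes required for ~{needed_tb:.0f} TB: {nodes_needed}")  # -> 6
```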

DC2.xlarge and DC2.8xlarge are dense-compute node types backed by SSDs, suited to performance-intensive workloads. Parson has no such workloads for the time being, so paying for dense-compute nodes is not justified at this stage; the real-time integration and data-driven analytics goals sit on a five-year roadmap, and the cluster can be resized or moved to different node types when those workloads arrive.

Furthermore, since the number of users accessing the application would be around 100, a small cluster of the smaller ds2.xlarge nodes would suffice; the larger ds2.8xlarge nodes (Option B) would add capacity the workload does not need. Therefore, launching a Redshift cluster with ds2.xlarge nodes (Option A) is the best suggestion for Parson Fortunes Ltd.
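As a final illustration, the sketch below runs a simple analytics query against the assumed cluster through the Redshift Data API; the cluster identifier, database, user, table, and column names are hypothetical.

```python
import time
import boto3

# Illustrative sketch: a basic analytics query against the (assumed)
# "parson-dwh" cluster using the Redshift Data API. Table and column
# names are hypothetical.
data_api = boto3.client("redshift-data", region_name="ap-south-1")

run = data_api.execute_statement(
    ClusterIdentifier="parson-dwh",
    Database="analytics",
    DbUser="dwh_admin",
    Sql="SELECT store_id, SUM(sales_amount) AS total_sales "
        "FROM daily_sales GROUP BY store_id ORDER BY total_sales DESC LIMIT 10;",
)

# Poll until the statement finishes, then fetch the result set.
while True:
    status = data_api.describe_statement(Id=run["Id"])["Status"]
    if status in ("FINISHED", "FAILED", "ABORTED"):
        break
    time.sleep(2)

if status == "FINISHED":
    rows = data_api.get_statement_result(Id=run["Id"])["Records"]
    print(rows)
```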