Migrating Parson Fortunes Ltd's Data Warehouse to AWS: Benefits, Costs, and Roadmap

Question

Parson Fortunes Ltd is an Asia-based department store operator with an extensive network of 131 stores, spanning approximately 4.1 million m² of retail space across cities in India, China, Vietnam, Indonesia, and Myanmar. Parson holds large data assets, around 10 TB of structured data and 5 TB of unstructured data, and is planning to host its data warehouse on AWS and its unstructured data on S3.

The Parson IT team is well aware of the scalability and performance capabilities of AWS services.

Parson currently runs its DWH on-premises on Teradata and is concerned about the overall cost of the DWH on AWS.

They want to initially migrate the platform onto AWS and use it for basic analytics, and they don't have any performance-intensive workloads in place for the time being.

Their five-year roadmap includes business needs around real-time data integration and data-driven analytics.

Currently, around 100 users access the application.

What is your suggestion? Select 1 option.

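Answers

A. Launch Redshift cluster with node types DS2.xlarge to fulfill the requirements

B. Launch Redshift cluster with node types DS2.8xlarge to fulfill the requirements

C. Launch Redshift cluster with node types DC2.xlarge to fulfill the requirements

D. Launch Redshift cluster with node types DC2.8xlarge to fulfill the requirements

Explanations

Answer: A.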

Option A is correct. DS2 node types are optimized for large data workloads and use hard disk drive (HDD) storage. A cluster of DS2.xlarge nodes fulfills the requirements, since Redshift provides massively parallel processing across multiple nodes.

Based on the amount of data to be loaded, this is the right option.
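
As a rough check on that sizing claim, the sketch below estimates how many DS2.xlarge nodes the 10 TB of structured data would need. It assumes 2 TB of HDD storage per node (the per-node figure AWS documents for ds2.xlarge) and a hypothetical 75% fill target to leave room for growth and temporary query space. The function name and the fill target are illustrative and do not come from the question, and the estimate ignores column compression, which would reduce the footprint.

```python
import math


def nodes_needed(data_tb: float, node_storage_tb: float, fill_target: float = 0.75) -> int:
    """Return the smallest node count whose usable storage covers data_tb.

    fill_target is an assumed planning margin (the fraction of raw storage
    we are willing to fill), not an AWS-defined limit.
    """
    if not 0 < fill_target <= 1:
        raise ValueError("fill_target must be in (0, 1]")
    usable_per_node_tb = node_storage_tb * fill_target
    return math.ceil(data_tb / usable_per_node_tb)


# 10 TB of structured data on ds2.xlarge nodes (2 TB each, assumed).
print(nodes_needed(10, 2.0, fill_target=1.0))  # 5 nodes with no headroom
print(nodes_needed(10, 2.0))                   # 7 nodes with the assumed 75% fill target
```

Either way the result is a small multi-node cluster, which is consistent with choosing the smaller DS2 size and adding nodes later.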

https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html#rs-about-clusters-and-

Option B is incorrect. Although the cost per terabyte of storage is about the same for ds2.xlarge and ds2.8xlarge, a ds2.8xlarge cluster needs at least two nodes and so provisions far more capacity than 10 TB of data calls for. Deploying ds2.xlarge nodes is sufficient for the requirements in the question.

Please refer to the following link.

https://aws.amazon.com/blogs/aws/amazon-redshift-now-faster-and-more-cost-effective-than-ever/

Option C is incorrect. DC2 node types are optimized for performance-intensive workloads.

The question states that there are no performance-intensive workloads for the time being, so this is not a requirement.

https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html#rs-about-clusters-and-

Option D is incorrect. DC2 node types are optimized for performance-intensive workloads.

The question states that there are no performance-intensive workloads for the time being, so this is not a requirement.

https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html#rs-about-clusters-and-

Based on the requirements, it is recommended to launch a Redshift cluster on AWS to host the data warehouse. Redshift is a fully managed, petabyte-scale data warehouse service that handles large amounts of structured data efficiently and cost-effectively. A cluster can be resized as demand changes, and Redshift achieves its query performance through columnar storage and massively parallel processing.

Node types refer to the hardware configuration of the cluster nodes that Redshift uses to store data and perform query operations. The node type selected should be based on the workload and expected usage patterns of the data warehouse.
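
To make the role of the node type concrete, here is a minimal sketch of how it appears as a setting when a cluster is created programmatically. The helper below only builds a dictionary of keyword arguments shaped for the `create_cluster` call of boto3's Redshift client; it does not contact AWS. The helper name, the cluster identifier, the database name, and the credentials are all hypothetical placeholders, and whether a given node type can be launched depends on the region and account.

```python
from typing import Any, Dict


def build_cluster_params(identifier: str, node_type: str, number_of_nodes: int,
                         master_username: str, master_password: str,
                         db_name: str = "dwh") -> Dict[str, Any]:
    """Build keyword arguments for boto3's Redshift create_cluster call."""
    if number_of_nodes < 1:
        raise ValueError("number_of_nodes must be at least 1")
    params: Dict[str, Any] = {
        "ClusterIdentifier": identifier,
        "NodeType": node_type,            # e.g. "ds2.xlarge"
        "MasterUsername": master_username,
        "MasterUserPassword": master_password,
        "DBName": db_name,
    }
    if number_of_nodes == 1:
        # A single-node cluster takes no NumberOfNodes argument.
        params["ClusterType"] = "single-node"
    else:
        params["ClusterType"] = "multi-node"
        params["NumberOfNodes"] = number_of_nodes
    return params


# Placeholder values for illustration only; never hard-code real credentials.
params = build_cluster_params("parson-dwh", "ds2.xlarge", 5, "admin", "CHANGE_ME")
# With boto3 installed and credentials configured, the call would look like:
#   boto3.client("redshift").create_cluster(**params)
```

Keeping the node type and node count as plain parameters makes it easy to revisit the choice when the workload changes.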

Option A: Launch Redshift cluster with node types DS2.xlarge to fulfill the requirements. The DS2.xlarge node type is the smaller dense storage size, with 4 vCPUs, 31 GiB of RAM, and 2 TB of HDD storage per node. It suits large data volumes where storage cost matters more than peak query speed. A multi-node cluster of DS2.xlarge nodes can hold the 10 TB of structured data and serve about 100 users running basic analytics, and nodes can be added as the data grows.

Option B: Launch Redshift cluster with node types DS2.8xlarge to fulfill the requirements. The DS2.8xlarge node type is the larger dense storage size, with 36 vCPUs, 244 GiB of RAM, and 16 TB of HDD storage per node. A DS2.8xlarge cluster requires at least two nodes, so the smallest cluster provisions 32 TB. That is overkill for the current 10 TB and would cost more than the requirements justify.

Option C: Launch Redshift cluster with node types DC2.xlarge to fulfill the requirements. DC2 node types are designed for dense compute workloads and use SSD storage, trading storage capacity for performance. AWS documents the DC2 family in the sizes dc2.large (2 vCPUs, 15 GiB of RAM, and 160 GB of SSD storage per node) and dc2.8xlarge; the size named in this option is kept as given in the question. With so little storage per node, the small DC2 size is a poor match for 10 TB of data, and the question states that there are no performance-intensive workloads, so this option is not a good fit.

Option D: Launch Redshift cluster with node types DC2.8xlarge to fulfill the requirements. The DC2.8xlarge node type is the larger dense compute size, with 32 vCPUs, 244 GiB of RAM, and 2.56 TB of SSD storage per node. It delivers high performance for demanding, high-concurrency workloads, but Parson has no such workloads at present, so it would be overkill for the current requirements and would result in higher costs.
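
The four options can be compared on storage alone with a short sketch. The per-node storage and node-count limits below are the figures AWS documents for these node types, entered here as assumptions; dc2.large stands in for the smaller DC2 size because that is the size AWS lists. The function name is illustrative, and the calculation ignores compression and headroom.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodeType:
    name: str
    storage_tb: float  # storage per node, in TB (assumed from AWS documentation)
    min_nodes: int     # smallest allowed cluster
    max_nodes: int     # largest allowed cluster


NODE_TYPES = [
    NodeType("ds2.xlarge", 2.0, 1, 32),
    NodeType("ds2.8xlarge", 16.0, 2, 128),
    NodeType("dc2.large", 0.16, 1, 32),
    NodeType("dc2.8xlarge", 2.56, 2, 128),
]


def smallest_cluster(node: NodeType, data_tb: float) -> Optional[int]:
    """Return the fewest nodes that hold data_tb, or None if the limit is exceeded."""
    needed = max(node.min_nodes, math.ceil(data_tb / node.storage_tb))
    return needed if needed <= node.max_nodes else None


for node in NODE_TYPES:
    count = smallest_cluster(node, 10.0)
    if count is None:
        print(f"{node.name}: cannot hold 10 TB within {node.max_nodes} nodes")
    else:
        print(f"{node.name}: {count} nodes, {count * node.storage_tb:.2f} TB provisioned")
```

Under these assumptions, ds2.xlarge covers 10 TB with 5 nodes, ds2.8xlarge is forced to its 2-node minimum and provisions 32 TB, dc2.large cannot reach 10 TB within its node limit, and dc2.8xlarge needs 4 SSD-backed nodes.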

Therefore, based on the requirements stated, it is recommended to select Option A, Launch Redshift cluster with node types DS2.xlarge to fulfill the requirements. Dense storage nodes hold the 10 TB of structured data at a lower storage cost, basic analytics for about 100 users does not need the SSD-backed performance of DC2, and the cluster can be resized as the five-year roadmap introduces heavier workloads.