Tiger Investments (TI) is a private equity trust manager specializing in border market investments.
The Group is considered a pioneer investor in Southeast Asia's Greater Sub-region and the Caribbean.
Tiger Capital creates private equity funds targeting pre-emerging, post-conflict or post-disaster economies that are undergoing transition and are poised for rapid growth. The funds invest commercially in basic businesses, targeting attractive economic and social returns.
Tiger Capital invests through a diversity of financial instruments, including equity and debt.
TI is planning to launch an EMR cluster to complement its ETL workloads running on Data Pipeline.
The team is looking for a way to store persistent data with server-side encryption, read-after-write consistency, and list consistency, and to enable a data lake for the enterprise to support analytics.
Select 1 option.
A. B. C. D.
Answer: B.
Option A is incorrect - HDFS provides only ephemeral storage, which is reclaimed when the cluster terminates.
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-file-systems.html
Option B is correct - EMRFS provides the convenience of storing persistent data in Amazon S3 for use with Hadoop while also providing features like Amazon S3 server-side encryption, read-after-write consistency, and list consistency.
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-file-systems.html
Option C is incorrect - Each node is created from an EC2 instance that comes with a preconfigured block of pre-attached disk storage called an instance store.
Data on instance store volumes persists only during the life of its EC2 instance.
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-file-systems.html
Option D is incorrect - This is the same as described above for option C.
The local file system refers to a locally connected disk.
When you create a Hadoop cluster, each node is created from an Amazon EC2 instance that comes with a preconfigured block of pre-attached disk storage called an instance store.
Data on instance store volumes persists only during the lifecycle of its Amazon EC2 instance.
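The difference between the storage options can be illustrated by the URI prefix a Hadoop job uses for its paths. The following is a minimal Python sketch, assuming the conventional prefixes (`hdfs://` or no prefix for HDFS, `s3://` for EMRFS, `file://` for the local file system). The function name `storage_kind`, the bucket name, and the paths are hypothetical and only serve the illustration.

```python
from typing import Tuple
from urllib.parse import urlparse

# Maps a URI scheme to (file system, survives cluster termination?).
# Assumption: conventional Hadoop/EMR prefixes; a cluster may be configured differently.
STORAGE_BY_SCHEME = {
    "hdfs": ("HDFS", False),                       # ephemeral, reclaimed with the cluster
    "s3": ("EMRFS (Amazon S3)", True),             # persistent object storage
    "file": ("local file system (instance store)", False),  # lives only as long as the instance
}


def storage_kind(uri: str) -> Tuple[str, bool]:
    """Return the file system behind a path and whether its data is persistent."""
    # A path without a prefix is treated as HDFS in this sketch.
    scheme = urlparse(uri).scheme or "hdfs"
    if scheme not in STORAGE_BY_SCHEME:
        raise ValueError(f"unrecognized storage scheme: {scheme!r}")
    return STORAGE_BY_SCHEME[scheme]


if __name__ == "__main__":
    for path in ("s3://example-bucket/lake/", "hdfs:///user/hadoop/tmp", "file:///mnt/scratch"):
        name, persistent = storage_kind(path)
        print(f"{path}: {name}, persistent={persistent}")
```

Only the `s3://` path maps to storage that outlives the cluster, which is why the persistent-data requirement points to EMRFS.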
The best option for Tiger Investments to store persistent data with server-side encryption, read-after-write consistency, and list consistency, and to enable a data lake for the enterprise to support analytics, is:
B. EMRFS implementation of HDFS used for reading and writing regular files from Amazon EMR directly to Amazon S3
Explanation:
EMRFS (EMR File System) is the implementation of HDFS that Amazon EMR clusters use to read and write regular files directly to Amazon S3. It allows for seamless integration between Hadoop applications running on EMR and data stored in S3. Using EMRFS, Tiger Investments can store its persistent data in Amazon S3, which provides low-cost and durable storage. Amazon S3 also supports server-side encryption to protect the data at rest.
Because EMRFS is an HDFS implementation backed by Amazon S3, it offers read-after-write consistency and list consistency: once data is written to S3, subsequent reads return the updated data, and list operations show all the objects that were written.
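As an illustration of the server-side encryption point, one way to request it for EMRFS is a configuration classification supplied when the cluster is created. The sketch below only builds that structure in Python. It assumes the `emrfs-site` classification and the `fs.s3.enableServerSideEncryption` property as described in the Amazon EMR documentation, so confirm both against the EMR release in use. The helper name `emrfs_sse_configuration` is hypothetical.

```python
import json


def emrfs_sse_configuration(enabled: bool = True) -> list:
    """Build an EMR configuration list that toggles S3 server-side encryption for EMRFS.

    Assumption: the 'emrfs-site' classification and the
    'fs.s3.enableServerSideEncryption' property apply to the target EMR release.
    """
    return [
        {
            "Classification": "emrfs-site",
            "Properties": {
                # EMR configuration property values are strings, not booleans.
                "fs.s3.enableServerSideEncryption": "true" if enabled else "false",
            },
        }
    ]


if __name__ == "__main__":
    # Print the JSON form that a cluster-creation request would carry.
    print(json.dumps(emrfs_sse_configuration(), indent=2))
```

The resulting list has the shape expected by the `Configurations` parameter of a cluster-creation call or the `--configurations` option of `aws emr create-cluster`. Newer EMR releases can also set encryption through a security configuration instead.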
Using EMRFS to access S3 data also provides several benefits, including:
In summary, using EMRFS to access S3 data provides a scalable, durable, and cost-effective storage solution for Tiger Investments' persistent data needs while also providing the desired consistency and encryption options.