Tiger Capital Private Equity Funds

EMRFS Consistency View and Encryption for Big Data Processing

Question

Tiger Investments (TI) is a private equity fund manager specializing in frontier market investments.

The Group is considered a pioneer investor in Southeast Asia's Greater Sub-region and the Caribbean.

Tiger Capital creates private equity funds targeting pre-emerging, post-conflict or post-disaster economies that are undergoing transition and are poised for rapid growth.

The funds invest commercially in basic businesses, targeting attractive economic and social returns.

Tiger Capital invests through a diversity of financial instruments, including equity and debt.

TI launched EMR 3.2.1 using EMRFS storage to support its real-time data analytics.

The IT team observed that when objects are added to EMRFS in one operation and then listed immediately in a subsequent operation, the list and the set of objects processed are incomplete most of the time.

This is a recurring problem that the TI team faces, mostly when running sequential multi-step extract-transform-load (ETL) data processing pipelines.

EMRFS Consistency View is enabled.

EMRFS encryption on S3 and Consistency notifications both need to be enabled to align with enterprise security guidelines.

Please advise. Select 2 options.

Answers

Explanations


A. B. C. D.

Answer: B, C.

Option A is incorrect - S3 server-side encryption with AWS KMS-managed keys (SSE-KMS) is not available when using Amazon EMR release version 4.4 or earlier, and TI runs EMR 3.2.1.

Option B is correct - You use an AWS KMS customer master key (CMK) set up with policies suitable for Amazon EMR.

https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-encryption-enable.html#emr-awskms-keys
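As an illustration of "policies suitable for Amazon EMR", the sketch below builds a single key-policy statement that lets a cluster's EC2 instance role use the key. It is a fragment under stated assumptions, not a complete key policy: the account ID is a placeholder, the role name assumes the default EMR instance role is in use, and the helper name is hypothetical.

```python
import json


def emr_key_user_statement(role_arn: str) -> dict:
    """Build one KMS key-policy statement granting key usage to an IAM role."""
    return {
        "Sid": "AllowEmrInstanceRoleToUseKey",
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": [
            "kms:Encrypt",
            "kms:Decrypt",
            "kms:ReEncrypt*",
            "kms:GenerateDataKey*",
            "kms:DescribeKey",
        ],
        # Inside a key policy, "*" refers to the key the policy is attached to.
        "Resource": "*",
    }


# Placeholder account ID; role name assumed to be the default EMR instance role.
statement = emr_key_user_statement(
    "arn:aws:iam::111122223333:role/EMR_EC2_DefaultRole"
)
policy_fragment = {"Version": "2012-10-17", "Statement": [statement]}
print(json.dumps(policy_fragment, indent=2))
```

A real key policy would also carry a statement for key administrators; only the key-user part is shown here.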

Option C is correct - With Amazon S3 client-side encryption, encryption and decryption take place in the EMRFS client on your cluster.

Objects are encrypted before being uploaded to Amazon S3 and decrypted after they are downloaded.

The provider you specify supplies the encryption key that the client uses.

The client can use keys provided by AWS KMS (CSE-KMS) or a custom Java class that provides the client-side master key (CSE-C).

The encryption specifics are slightly different between CSE-KMS and CSE-C, depending on the specified provider and the metadata of the object being decrypted or encrypted.

https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-data-encryption-options.html
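The two client-side modes differ mainly in which materials provider EMRFS is told to use. The sketch below builds an `emrfs-site` configuration for either mode. It assumes a release that accepts configuration classifications (EMR 4.x and later; the 3.2.1 cluster in the scenario is configured differently), and the property names and the KMS provider class are given as recalled from the EMR documentation, so verify them against your release. The helper name, the key ID, the custom class `com.example.TiMaterialsProvider`, and the JAR location are hypothetical.

```python
KMS_PROVIDER = "com.amazon.ws.emr.hadoop.fs.cse.KMSEncryptionMaterialsProvider"


def emrfs_cse_config(kms_key_id=None, custom_provider_class=None,
                     custom_provider_jar_uri=None):
    """Return an emrfs-site classification for CSE-KMS or a custom provider."""
    # Exactly one of the two modes must be chosen.
    if (kms_key_id is None) == (custom_provider_class is None):
        raise ValueError("give exactly one of kms_key_id or custom_provider_class")

    props = {"fs.s3.cse.enabled": "true"}
    if kms_key_id is not None:
        # CSE-KMS: AWS KMS supplies the key material.
        props["fs.s3.cse.encryptionMaterialsProvider"] = KMS_PROVIDER
        props["fs.s3.cse.kms.keyId"] = kms_key_id
    else:
        # Custom provider: a user-written Java class supplies the master key.
        props["fs.s3.cse.encryptionMaterialsProvider"] = custom_provider_class
        if custom_provider_jar_uri is not None:
            props["fs.s3.cse.encryptionMaterialsProvider.uri"] = custom_provider_jar_uri

    return {"Classification": "emrfs-site", "Properties": props}


# Hypothetical identifiers, for illustration only.
cse_kms = emrfs_cse_config(kms_key_id="hypothetical-key-id")
cse_custom = emrfs_cse_config(
    custom_provider_class="com.example.TiMaterialsProvider",
    custom_provider_jar_uri="s3://example-bucket/providers/ti-provider.jar",
)
print(cse_kms)
print(cse_custom)
```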

Option D is incorrect - Amazon EMR does not support S3 server-side encryption with a custom materials provider.

The issue faced by TI's IT team in their EMRFS-based real-time data analytics solution is that objects added to EMRFS in one operation are not consistently listed in subsequent operations, especially in multi-step ETL pipelines. This could be due to the eventual consistency model of Amazon S3, under which newly written objects may not appear immediately in list operations issued through EMRFS.
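The general remedy for a lagging listing is to compare what was listed against what is expected and retry until the two agree. The sketch below shows only that idea with an in-memory stand-in for the object store. It is not how EMRFS is implemented, and the function name, retry values, and key names are assumptions chosen for illustration.

```python
import time


def list_until_consistent(list_keys, expected_keys, retry_count=5,
                          retry_period_seconds=10, sleep=time.sleep):
    """Call list_keys() until every expected key is listed or retries run out."""
    expected = set(expected_keys)
    missing = set(expected)
    # One initial attempt plus up to retry_count retries.
    for attempt in range(retry_count + 1):
        listed = set(list_keys())
        missing = expected - listed
        if not missing:
            return sorted(listed)
        if attempt < retry_count:
            sleep(retry_period_seconds)
    raise RuntimeError(f"still missing after {retry_count} retries: {sorted(missing)}")


# Simulated store whose listing lags: the second key appears on the third call.
responses = iter([["part-0"], ["part-0"], ["part-0", "part-1"]])
result = list_until_consistent(
    lambda: next(responses),
    ["part-0", "part-1"],
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(result)  # ['part-0', 'part-1']
```

In an ETL pipeline, the step that wrote the objects would pass its own output keys as `expected_keys`, so the next step only starts once the listing is complete.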

Separately, to align with enterprise security guidelines, TI needs to enable EMRFS encryption on S3. This does not affect the consistency issue; it adds a layer of security by encrypting data at rest in S3 using either server-side or client-side encryption.

Option A: S3 server-side encryption with KMS Key Management (SSE-KMS). SSE-KMS is a server-side encryption method in which S3 encrypts the data using keys managed in AWS Key Management Service (KMS). It provides a high level of security and compliance, as well as fine-grained access controls. However, SSE-KMS is not available on Amazon EMR release version 4.4 or earlier, so it cannot be used with TI's EMR 3.2.1 cluster.

Option B: AWS KMS Customer Master Keys (CMKs) for EMRFS Encryption. Using AWS KMS CMKs, TI can enable EMRFS encryption on S3 with client-side encryption. This provides additional security and control over the encryption process, as TI can manage its encryption keys and policies using AWS KMS. The key must be set up with policies suitable for Amazon EMR.

Option C: S3 client-side encryption with custom materials provider. S3 client-side encryption encrypts objects in the EMRFS client on the cluster before they are uploaded, giving TI a higher level of control over keys and policies. With a custom materials provider, TI supplies its own Java class that provides the client-side master key.

Option D: Server-side encryption with custom materials provider. This option is not valid because Amazon EMR does not support S3 server-side encryption with a custom materials provider.

In conclusion, options B and C are the correct choices, as both can be used to enable EMRFS encryption on S3 at the required level of security and compliance. Option A is ruled out by the EMR release version, and option D is not supported. Encryption is independent of the consistency issue faced by TI's IT team, which is the concern of EMRFS Consistency View, already enabled in this scenario.
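For reference, Consistency View and its notifications are controlled through `emrfs-site` properties. The sketch below assembles such a configuration. It assumes a release that accepts configuration classifications (EMR 4.x and later, unlike the 3.2.1 cluster in the scenario), and the property names are given as recalled from the EMR documentation, so check them against your release. The helper name, the retry values, and the queue name are hypothetical.

```python
def emrfs_consistent_view_config(retry_count=5, retry_period_seconds=10,
                                 cloudwatch=True, sqs_queue_name=None):
    """Return an emrfs-site classification enabling Consistency View."""
    props = {
        "fs.s3.consistent": "true",
        "fs.s3.consistent.retryCount": str(retry_count),
        "fs.s3.consistent.retryPeriodSeconds": str(retry_period_seconds),
        # Publish inconsistency metrics to CloudWatch when requested.
        "fs.s3.consistent.notification.CloudWatch": str(cloudwatch).lower(),
    }
    if sqs_queue_name is not None:
        # Also push inconsistency messages to an SQS queue.
        props["fs.s3.consistent.notification.SQS"] = "true"
        props["fs.s3.consistent.notification.SQS.queueName"] = sqs_queue_name
    return {"Classification": "emrfs-site", "Properties": props}


# Hypothetical queue name, for illustration only.
consistency_config = emrfs_consistent_view_config(
    sqs_queue_name="ti-emrfs-inconsistency"
)
print(consistency_config)
```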