AWS Certified Machine Learning - Specialty: Encrypting Data at Rest in a Kinesis Data Firehose and S3 Solution

Achieving Encryption at Rest in a Kinesis Data Firehose and S3 Solution

Question

You work as a machine learning specialist at a credit card transaction processing company.

You have built a data streaming pipeline using Kinesis Data Firehose and S3.

Due to the personally identifiable information contained in your data stream, your data must be encrypted in flight and at rest.

How should you configure your solution to achieve encryption at rest?

Answers

Explanations

A. B. C. D.

Answer: C.

Option A is incorrect.

Encrypting the data at the Kinesis consumer application level does not allow for encryption at the S3 bucket.

Once the data has reached the consumer application, it has already been stored in S3 without being encrypted.

Option B is incorrect.

The encryption setting for a Kinesis Data Firehose S3 destination does not offer SSE-S3.

It offers SSE-KMS.

(See the Amazon Kinesis Data Firehose developer documentation titled Configure Settings)

Option C is correct.

The Kinesis Data Firehose documentation states that “Kinesis Data Firehose supports Amazon S3 server-side encryption with AWS Key Management Service (AWS KMS) for encrypting delivered data in Amazon S3.

You can choose not to encrypt the data or encrypt with a key from the list of AWS KMS keys that you own.

For more information, see Protecting Data Using Server-Side Encryption with AWS KMS-Managed Keys (SSE-KMS).”
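As a rough illustration of the setting described in that quote, the sketch below builds the S3 destination configuration that boto3's `create_delivery_stream` call accepts, with `KMSEncryptionConfig` pointing at a KMS key. The stream name, role, bucket, key ARN, and prefix are all hypothetical placeholders, and the helper function names are invented for this example. The boto3 call itself is wrapped in a function that is not invoked here, because it needs the SDK and valid AWS credentials.

```python
import json


def build_s3_destination_config(role_arn, bucket_arn, kms_key_arn, prefix="transactions/"):
    """Return an ExtendedS3DestinationConfiguration dict that enables SSE-KMS."""
    return {
        "RoleARN": role_arn,
        "BucketARN": bucket_arn,
        "Prefix": prefix,
        # Choosing KMSEncryptionConfig (instead of NoEncryptionConfig) asks
        # S3 to encrypt delivered objects with the given KMS key.
        "EncryptionConfiguration": {
            "KMSEncryptionConfig": {"AWSKMSKeyARN": kms_key_arn}
        },
    }


def create_encrypted_stream(stream_name, destination_config):
    """Create the delivery stream. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above runs without the SDK

    firehose = boto3.client("firehose")
    return firehose.create_delivery_stream(
        DeliveryStreamName=stream_name,
        DeliveryStreamType="DirectPut",
        ExtendedS3DestinationConfiguration=destination_config,
    )


# Placeholder ARNs: the account, role, bucket, and key are hypothetical.
config = build_s3_destination_config(
    role_arn="arn:aws:iam::111122223333:role/firehose-delivery-role",
    bucket_arn="arn:aws:s3:::example-transactions-bucket",
    kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
)
print(json.dumps(config, indent=2))
```

In a real deployment the delivery role would also need permission to use the key (typically `kms:GenerateDataKey` and `kms:Decrypt`), otherwise delivery to S3 is likely to fail.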

Option D is incorrect.

Kinesis Data Firehose does not use 256-bit AES-GCM with HKDF.

It uses SSE-KMS.

(See the Amazon Kinesis Data Firehose developer documentation titled Configure Settings)

Reference:

Please see the Amazon Kinesis Data Firehose developer guide documentation titled Creating an Amazon Kinesis Data Firehose Delivery Stream and the Amazon Kinesis Data Streams developer guide documentation titled What is Server-Side Encryption for Kinesis Data Streams.

To achieve encryption at rest for the data streaming pipeline using Kinesis Data Firehose and S3, we can use server-side encryption (SSE) provided by S3. There are multiple SSE options provided by S3 to encrypt data at rest, including Amazon S3-managed encryption keys (SSE-S3), customer-provided encryption keys (SSE-C), and AWS Key Management Service (KMS) encryption keys (SSE-KMS).

However, because the data stream contains personally identifiable information (PII), customer-provided encryption keys (SSE-C) are a poor fit: the caller must supply and safeguard the key on every request, which adds operational overhead and risk. That leaves SSE-S3 and SSE-KMS as the practical candidates for encrypting the data at rest.

SSE-S3 is the simplest option provided by S3, which automatically encrypts data at rest using Amazon S3-managed encryption keys. With this option, data is encrypted before it is written to disk and decrypted when it is read. However, because S3 owns and manages the keys, SSE-S3 gives us little control over who can use them and less visibility into their use than SSE-KMS does.

On the other hand, SSE-KMS provides greater control over the encryption keys. SSE-KMS uses the AWS Key Management Service (KMS) to manage encryption keys, which provides additional benefits, such as auditing of key usage, key rotation, and the ability to disable a key or revoke access to it. With SSE-KMS, S3 encrypts each object with a data key that is protected by a KMS key (formerly called a customer master key, or CMK), and access to that key is governed by KMS key policies.

Therefore, the recommended approach to achieve encryption at rest for the data streaming pipeline is to configure Firehose to use S3 server-side encryption with AWS Key Management Service (SSE-KMS), option C. With this setting, S3 encrypts each delivered object as it is written to disk, and the encryption keys are securely managed by KMS.
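One way to confirm that delivered objects really are protected by SSE-KMS is to inspect the metadata S3 returns for an object. The sketch below checks the `ServerSideEncryption` and `SSEKMSKeyId` fields that appear in a boto3 `head_object` response. The function name and the sample responses are hypothetical and trimmed to the relevant fields; in practice the dictionary would come from `boto3.client("s3").head_object(Bucket=..., Key=...)`.

```python
def is_sse_kms_encrypted(head_response, expected_key_arn=None):
    """Return True if a head_object-style response reports SSE-KMS.

    If expected_key_arn is given, the object must also have been
    encrypted under that specific KMS key.
    """
    # S3 reports "aws:kms" for SSE-KMS and "AES256" for SSE-S3.
    if head_response.get("ServerSideEncryption") != "aws:kms":
        return False
    if expected_key_arn is not None:
        return head_response.get("SSEKMSKeyId") == expected_key_arn
    return True


# Hypothetical, trimmed responses used only for illustration.
key_arn = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
kms_object = {"ServerSideEncryption": "aws:kms", "SSEKMSKeyId": key_arn}
s3_managed_object = {"ServerSideEncryption": "AES256"}

print(is_sse_kms_encrypted(kms_object, key_arn))  # True
print(is_sse_kms_encrypted(s3_managed_object))    # False
```

A check like this could run against a sample of objects under the delivery prefix after the stream starts writing.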

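Option A, Encrypt the data at the data consumer application level, does not meet the requirement: by the time a consumer application reads the data, Firehose has already delivered it to S3 without the requested encryption. It would also require additional development effort and could introduce security vulnerabilities.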

Option D, Encrypt the data by configuring Firehose to use S3 server-side encryption with 256-bit AES-GCM with HKDF, is not a valid option. HKDF is a key derivation function rather than an encryption algorithm, and S3 does not offer AES-GCM with HKDF as a server-side encryption choice.