A company needs to stream logs using the AWS Kinesis Firehose service.
They need to decide on a data store.
The resulting files on the data store will be heavily queried over a one-week period and after that can be archived for future analysis.
Which of the following would be the ideal steps to implement the data store? Please choose 2 correct Options.
A. Ensure the destination for Kinesis Firehose is marked as S3.
B. Ensure the destination for Kinesis Firehose is marked as Redshift.
C. Create a Lifecycle policy for S3 to archive older files.
D. Create a Job to move older data from the Redshift table.

Answer - A and C.
The ideal way is to use Lifecycle policies for S3.
The AWS Documentation mentions the following on S3 Lifecycle policies.
To manage your objects so that they are stored cost effectively throughout their lifecycle, configure their lifecycle.
A lifecycle configuration is a set of rules that define actions that Amazon S3 applies to a group of objects.
There are two types of actions:
Transition actions - Define when objects transition to another storage class.
For example, you might choose to transition objects to the STANDARD_IA storage class 30 days after you created them, or archive objects to the GLACIER storage class one year after creating them.
Expiration actions - Define when objects expire.
Amazon S3 deletes expired objects on your behalf.
The lifecycle expiration costs depend on when you choose to expire objects.
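As a sketch of what such a lifecycle configuration looks like in practice, the documentation's example (STANDARD_IA after 30 days, GLACIER after one year) could be expressed with boto3 roughly as follows. The bucket name, rule ID, and IAM setup are illustrative placeholders, not part of the question:

```python
# Sketch of an S3 lifecycle configuration mirroring the documentation's
# example: transition to STANDARD_IA after 30 days, then to GLACIER
# after one year. Rule ID and prefix are illustrative placeholders.
LIFECYCLE_CONFIGURATION = {
    "Rules": [
        {
            "ID": "example-tiering-rule",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # empty prefix = apply to all objects
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 365, "StorageClass": "GLACIER"},
            ],
        }
    ]
}


def apply_lifecycle(bucket_name: str) -> None:
    """Apply the configuration to a bucket (requires AWS credentials)."""
    import boto3  # imported lazily so the module loads without boto3 installed

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=LIFECYCLE_CONFIGURATION,
    )
```

A single rule may define multiple transitions, as above, so objects age through cheaper storage classes automatically without any custom jobs.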
For more information on S3 Lifecycle management, please refer to the URL below:
https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html

Option A - Ensure the destination for Kinesis Firehose is marked as S3:
Amazon Kinesis Firehose is a fully managed service for delivering real-time streaming data to destinations like Amazon S3, Amazon Redshift, or Amazon Elasticsearch Service. When Firehose delivers the data to S3, it stores the data as objects in the specified Amazon S3 bucket. As a result, option A is a valid step in implementing the data store.
S3 is an object storage service that provides durability, availability, and scalability at a low cost. S3 buckets can be configured with lifecycle policies that automate the transition of objects from one storage class to another after a specified period. This makes S3 a suitable choice for storing the logs delivered by Kinesis Firehose.
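To illustrate, creating a Firehose delivery stream with an S3 destination might look like the sketch below. The stream name, role ARN, bucket ARN, and buffering values are hypothetical; a real call needs AWS credentials and an IAM role that Firehose can assume:

```python
# Sketch of a Firehose delivery stream configuration targeting S3.
# All ARNs and names below are placeholders for illustration only.
S3_DESTINATION_CONFIG = {
    "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
    "BucketARN": "arn:aws:s3:::example-log-bucket",
    "Prefix": "logs/",  # objects land under this key prefix
    "BufferingHints": {"SizeInMBs": 5, "IntervalInSeconds": 300},
    "CompressionFormat": "GZIP",
}


def create_log_stream(stream_name: str) -> None:
    """Create the delivery stream (requires AWS credentials)."""
    import boto3  # imported lazily so the module loads without boto3 installed

    firehose = boto3.client("firehose")
    firehose.create_delivery_stream(
        DeliveryStreamName=stream_name,
        DeliveryStreamType="DirectPut",
        ExtendedS3DestinationConfiguration=S3_DESTINATION_CONFIG,
    )
```

Firehose buffers incoming records until the size or interval hint is reached, then writes a compressed object to the bucket under the given prefix.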
Option C - Create a Lifecycle policy for S3 to archive older files:
As mentioned above, S3 buckets can be configured with lifecycle policies that automate the transition of objects from one storage class to another after a specified period. In this case, after the files have been queried heavily for a week, they can be moved from the S3 Standard storage class to the S3 Glacier storage class, which is designed for long-term archival and backup of data.
This can help to optimize costs by moving the files to a lower-cost storage class, while still retaining the ability to access them if needed. As a result, option C is also a valid step in implementing the data store.
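The scenario's week-then-archive requirement maps onto a single lifecycle rule. The sketch below shows one way to express it with boto3; the rule ID, prefix, and bucket name are hypothetical:

```python
# Sketch of the lifecycle rule for this scenario: after the files have
# been heavily queried for a week, transition them to GLACIER.
# Rule ID and prefix are illustrative placeholders.
ARCHIVE_RULE = {
    "ID": "archive-logs-after-one-week",
    "Status": "Enabled",
    "Filter": {"Prefix": "logs/"},  # only the Firehose-delivered objects
    "Transitions": [{"Days": 7, "StorageClass": "GLACIER"}],
}


def archive_old_logs(bucket_name: str) -> None:
    """Attach the rule to the bucket (requires AWS credentials)."""
    import boto3  # imported lazily so the module loads without boto3 installed

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration={"Rules": [ARCHIVE_RULE]},
    )
```

Once attached, S3 applies the transition automatically, so no scheduled job or custom code is needed to move the week-old files.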
Option B - Ensure the destination for Kinesis Firehose is marked as Redshift:
Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. It provides an efficient way to store and query large amounts of structured data, using SQL queries. However, in this case, it may not be the best choice of destination for Kinesis Firehose, as Redshift is typically used for analytics workloads that require complex queries and joins.
In this case, it may be more appropriate to store the data in S3, and then use Amazon Athena, which is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Alternatively, Amazon Redshift Spectrum can also be used to run SQL queries directly against data stored in S3.
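For completeness, querying the delivered logs with Athena could be sketched as below. The database name, table name, SQL columns, and results bucket are all hypothetical, assuming a table has already been defined over the S3 data:

```python
# Sketch of running a SQL query over the S3-delivered logs with Athena.
# Database, table, columns, and output bucket are hypothetical names.
QUERY = """
SELECT status_code, COUNT(*) AS hits
FROM logs_db.firehose_logs
GROUP BY status_code
ORDER BY hits DESC
"""


def run_query() -> str:
    """Start the Athena query and return its execution ID (needs credentials)."""
    import boto3  # imported lazily so the module loads without boto3 installed

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=QUERY,
        QueryExecutionContext={"Database": "logs_db"},
        ResultConfiguration={"OutputLocation": "s3://example-query-results/"},
    )
    return response["QueryExecutionId"]
```

Athena queries the objects in place, so the same S3 bucket serves both the heavy first-week querying and, via the lifecycle policy, the long-term archive.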
Option D - Create a Job to move older data from the Redshift table:
As mentioned above, Redshift may not be the best choice of destination for Kinesis Firehose in this case. Therefore, option D is not a valid step in implementing the data store.
In summary, the ideal steps to implement the data store would be to ensure the destination for Kinesis Firehose is marked as S3, and to create a Lifecycle policy for S3 to archive older files.