Amazon SOA-C02: AWS Certified SysOps Administrator - Associate

How to Ensure File Integrity and Minimize Costs When Uploading Files to AWS Glacier

Question

As a SysOps Administrator, you are uploading files from your data center to AWS Glacier.

The total amount of data in your data center is approximately 70 TB.

However, after uploading, you notice that the names shown in Glacier do not match the names of the files you uploaded.

For example, you upload a file named demo.txt, but in Glacier it is identified only by an archive ID such as “TJgHcrOSfAkV6hdPqOATYfp_0ZaxL1pIBOc02iZ0gDPMr2ig-nhwd_PafstsdIf6HSrjHnP-3p6LCJClYytFT_CBhT9CwNxbRaM5MetS3I-GqwxI3Y8QtgbJbhEQPs0mJ3KExample”.

What can be done to keep the original file names in Glacier with minimal impact on cost? Choose 2 answers from the options given below.

Answers

Correct Answers: C and E.

You can upload the files to Amazon S3 either directly or by using a Snowball Edge device.

Lifecycle policies can then be used to transition the objects to the Glacier storage class.

Because the data is stored as S3 objects, each object keeps its key, so the original file names are preserved.
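The lifecycle transition described above can be sketched with boto3. This is a minimal illustration, not a definitive setup: the rule ID, the 30-day delay, and the function names are assumptions chosen for the example, and the call to AWS only runs if you invoke apply_lifecycle with a real bucket and credentials.

```python
def build_glacier_lifecycle(days_until_transition=30, prefix=""):
    """Build a lifecycle configuration that moves objects to Glacier.

    The rule ID and the default of 30 days are illustrative assumptions.
    An empty prefix applies the rule to every object in the bucket.
    """
    return {
        "Rules": [
            {
                "ID": "archive-to-glacier",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_until_transition, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


def apply_lifecycle(bucket_name, config):
    """Attach the configuration to a bucket (requires boto3 and credentials)."""
    import boto3  # imported here so the builder works without boto3 installed

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name, LifecycleConfiguration=config
    )
```

Objects transitioned this way are still listed under their original keys in the bucket; only their storage class changes.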

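Option A is invalid because running and maintaining a separate database is neither cost-effective nor easy to maintain.

Option B is invalid because the AWS Console does not support uploading archives to Glacier; archives must be uploaded with the AWS CLI, an SDK, or the REST API, and they would still be identified by archive ID.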

Option D is invalid because Direct Connect does not resolve the issue mentioned in the question.

For more information on uploading an archive, see the following URL:

https://docs.aws.amazon.com/amazonglacier/latest/dev/uploading-an-archive.html

The issue here is that when uploading files to AWS Glacier, the uploaded files have different names (archive IDs) than the original files. This can make it difficult to locate and retrieve files in the future. To address this issue, there are a few options available:

Option A: Create a PostgreSQL database and have an entry for file names in Glacier against the actual file names.

This option involves creating a database to maintain a mapping between the original file names and the archive IDs generated by Glacier. It can provide a more user-friendly way to locate and retrieve files, but the additional infrastructure and maintenance add cost, so it does not meet the minimal-cost requirement.
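To show what maintaining such a mapping involves, here is a minimal sketch of a name-to-archive-ID catalog. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a server; the table name, column names, and function names are assumptions made for illustration.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open (or create) the catalog; sqlite3 stands in for PostgreSQL here."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS archives ("
        "file_name TEXT PRIMARY KEY, archive_id TEXT NOT NULL)"
    )
    return conn


def record_upload(conn, file_name, archive_id):
    """Store the archive ID returned by Glacier against the original name."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO archives (file_name, archive_id) VALUES (?, ?)",
            (file_name, archive_id),
        )


def lookup_archive_id(conn, file_name):
    """Return the archive ID for a file name, or None if it is unknown."""
    row = conn.execute(
        "SELECT archive_id FROM archives WHERE file_name = ?", (file_name,)
    ).fetchone()
    return row[0] if row else None
```

Every upload must be followed by a record_upload call, and the database itself must be hosted and backed up, which is the ongoing overhead that rules this option out.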

Option B: Upload the files directly to Glacier using the AWS Console.

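The AWS Console can be used to create and manage Glacier vaults, but it cannot upload archives. Uploads go through the AWS CLI, an SDK, or the REST API, and each archive is still identified by a system-generated archive ID rather than its file name, so this option does not solve the problem.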

Option C: Use the AWS Snowball Edge device to upload the files to S3 and then move it to Glacier storage for archival using Lifecycle rules.

The AWS Snowball Edge is a device that can be used to physically transfer large amounts of data to AWS. In this approach, the data would be uploaded to the Snowball Edge, which would then be shipped to AWS for loading into S3. Once the data is in S3, lifecycle rules can be used to automatically move it to Glacier for archival.
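A rough calculation helps show why a physical transfer device is worth considering for about 70 TB. The sketch below estimates network transfer time; the link speeds are assumptions for illustration (the question does not state the available bandwidth), and 1 TB is taken as 10^12 bytes.

```python
def transfer_days(data_tb, link_mbps, utilization=1.0):
    """Estimate days needed to send data_tb terabytes over a link.

    Assumes 1 TB = 10**12 bytes and that the given fraction of the
    link (utilization) is available for the whole transfer.
    """
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86400  # seconds per day


# Hypothetical link speeds, in megabits per second
for mbps in (100, 1000):
    print(f"{mbps} Mbps: {transfer_days(70, mbps):.1f} days")
```

Under these assumptions the transfer takes roughly 65 days at 100 Mbps and about 6.5 days at 1 Gbps, before accounting for protocol overhead or competing traffic.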

Option D: Upload the files directly to Glacier through Direct Connect.

Direct Connect is a service that provides dedicated network connections between an organization's data center and AWS. It could make uploads faster and more reliable, but it requires additional infrastructure and setup, and archives uploaded directly to Glacier would still be identified by archive ID, so it does not address the naming problem.

Option E: Upload the files to Amazon S3 Infrequent Access and then use lifecycle policies to move them to Glacier.

Amazon S3 Infrequent Access is a storage class that is designed for infrequently accessed data. In this approach, the data would be uploaded to S3 Infrequent Access, and then lifecycle policies could be used to automatically move it to Glacier for long-term archival.
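The upload step for this approach might look like the following boto3 sketch. The helper names and the optional key prefix are assumptions for illustration; the point is that the object key is derived from the local file name, so the name survives the later transition to Glacier.

```python
from pathlib import Path


def object_key_for(local_path, prefix=""):
    """Derive the S3 object key from the local file name.

    The optional prefix (a hypothetical folder-like grouping) is
    prepended with a single slash.
    """
    name = Path(local_path).name
    return f"{prefix.rstrip('/')}/{name}" if prefix else name


def upload_as_standard_ia(local_path, bucket_name, prefix=""):
    """Upload one file to S3 in the Standard-IA storage class.

    Requires boto3 and valid AWS credentials; returns the key used.
    """
    import boto3  # imported here so object_key_for works without boto3

    key = object_key_for(local_path, prefix)
    boto3.client("s3").upload_file(
        str(local_path),
        bucket_name,
        key,
        ExtraArgs={"StorageClass": "STANDARD_IA"},
    )
    return key
```

For example, object_key_for("/data/demo.txt") yields the key demo.txt, which is the name the object keeps after a lifecycle rule moves it to Glacier.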

In summary, options C and E are the correct answers: both land the data in S3, where object keys preserve the original file names, and then use lifecycle rules to move it to Glacier at low cost. Option C suits the bulk transfer of the roughly 70 TB data set, while option E suits files uploaded over the network.