Azure Data Lake Storage - Ideal Solution for File System Storage with Reduced TCO

Question

Jennifer is a Cloud Engineer at Fabrikum LLC.

She works with data engineers and analysts to build object and file system storage for ingesting structured datasets from ERP and CRM/SAP data sources.

She must provision data storage that behaves like a file system, supports directory manipulation, and keeps the total cost of ownership (TCO) low.

Which storage solution should she choose?

Answers

A. Azure Data Lake Storage Gen2 with hierarchical namespace
B. Blob storage
C. Table storage
D. Azure File storage

Explanations

Correct Answer: A.

Jennifer needs a storage solution that can hold structured datasets from ERP and CRM/SAP data sources, supports directory manipulation, and keeps the total cost of ownership (TCO) low. Among the given options, Azure Data Lake Storage Gen2 with the hierarchical namespace is the most suitable choice.

Here's why:

A. Azure Data Lake Storage Gen2 with hierarchical namespace is built on Azure Blob storage and designed for big data analytics workloads, such as analyzing data ingested from ERP and CRM/SAP sources. It can store structured, semi-structured, and unstructured data, which fits Jennifer's requirements. The hierarchical namespace provides real directories and subdirectories, so a directory can be renamed or deleted as a single operation rather than object by object. It also exposes a Hadoop-compatible file system interface (the ABFS driver), which eases integration with big data analytics tools such as Apache Spark, Hive, and Hadoop. Because it builds on Blob storage, it keeps object-storage pricing and access tiers, which helps reduce TCO.

B. Blob storage is a highly scalable object storage solution designed for unstructured data such as images, videos, and documents. It can hold structured data files, but without the hierarchical namespace it has a flat namespace: directories are only simulated through name prefixes, so renaming or deleting a "directory" means touching every blob under that prefix. That makes it a poor fit for Jennifer's directory-manipulation requirement.

C. Table storage is a NoSQL key-value store for schemaless structured data, stored as entities addressed by a partition key and a row key. It is optimized for fast key-based lookups over large datasets, not for storing files ingested from ERP and CRM/SAP data sources. Partition keys help organize the data, but Table storage has no concept of files or directories.

D. Azure File storage (Azure Files) is a fully managed file share service in the cloud. It is accessed over SMB and also exposes a REST API. It is suitable for sharing files across different platforms, but it is not optimized for big data analytics workloads.

Therefore, Azure Data Lake Storage Gen2 with hierarchical namespace is the best choice for Jennifer's requirements.
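
The directory-manipulation difference between options A and B can be sketched with a small toy model. This is only an illustration in plain Python, not Azure SDK code: the function names, the dictionaries, and the sample paths are all hypothetical. It assumes a flat namespace stores full object names as keys, while a hierarchical namespace stores directories as real nested nodes.

```python
"""Toy model contrasting directory renames in flat vs. hierarchical namespaces.

Everything here is hypothetical illustration code; it does not call Azure.
"""


def rename_prefix_flat(blobs: dict, old: str, new: str) -> int:
    """Flat namespace: a 'directory' is only a shared name prefix, so every
    matching object must be re-keyed. Returns the number of per-object
    operations performed."""
    matching = [name for name in blobs if name.startswith(old + "/")]
    for name in matching:
        blobs[new + name[len(old):]] = blobs.pop(name)
    return len(matching)


def rename_dir_hierarchical(tree: dict, old: str, new: str) -> int:
    """Hierarchical namespace: the directory is a real node, so a rename
    re-links one entry regardless of how many files sit beneath it."""
    tree[new] = tree.pop(old)
    return 1


# Hypothetical sample data: the same three files in both models.
flat_store = {
    "raw/erp/2024-01.csv": b"",
    "raw/erp/2024-02.csv": b"",
    "raw/crm/accounts.csv": b"",
}
hierarchical_store = {
    "raw": {
        "erp": {"2024-01.csv": b"", "2024-02.csv": b""},
        "crm": {"accounts.csv": b""},
    }
}

flat_ops = rename_prefix_flat(flat_store, "raw", "staged")
tree_ops = rename_dir_hierarchical(hierarchical_store, "raw", "staged")
print("flat namespace operations:", flat_ops)
print("hierarchical namespace operations:", tree_ops)
```

In the flat model the work grows with the number of objects under the prefix, while the hierarchical rename stays a single step. For ingestion pipelines that regularly move whole directories, that difference shows up in transaction counts, which feeds into TCO.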