You have data files in Azure Blob storage.
You plan to transform the files and move them to Azure Data Lake Storage.
You need to transform the data by using mapping data flow.
Which Azure service should you use?
A. Azure Storage Sync
B. Azure Databricks
C. Azure Data Box Gateway
D. Azure Data Factory

Correct Answer: D
You can use the Copy activity in Azure Data Factory to copy data from and to Azure Data Lake Storage Gen2, and use Mapping Data Flow to transform data in Azure Data Lake Storage Gen2.
https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-data-lake-storage
The correct answer is option D, Azure Data Factory.
Azure Data Factory (ADF) is a cloud-based data integration service for creating, scheduling, and orchestrating data-driven workflows. ADF can move data between a wide range of sources and destinations, and can transform data using built-in transformations or custom code.
In the given scenario, you have data files in Azure Blob storage that need to be transformed and moved to Azure Data Lake Storage. ADF supports this through Mapping Data Flows, a visual data transformation tool that executes on scaled-out Apache Spark clusters managed by the service. With Mapping Data Flows, you can build data transformations without writing any code and integrate easily with a variety of data sources and destinations.
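Although Mapping Data Flows are built in a visual designer, ADF stores the design as a data flow script behind the scenes. A minimal sketch of what such a script might look like for this scenario is shown below; the stream names (BlobSource, CleanColumns, AdlsSink) and columns (id, amount) are illustrative assumptions, not part of the question:

```
// Read from a dataset pointing at Azure Blob storage (names are hypothetical)
source(output(
        id as string,
        amount as string
    ),
    allowSchemaDrift: true,
    validateSchema: false) ~> BlobSource
// Example transformation: cast a string column to a decimal
BlobSource derive(amountNum = toDecimal(amount)) ~> CleanColumns
// Write to a dataset pointing at Azure Data Lake Storage Gen2
CleanColumns sink(allowSchemaDrift: true,
    validateSchema: false) ~> AdlsSink
```

In practice you would rarely edit this script by hand; the drag-and-drop designer generates and maintains it for you.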
To perform the transformation, you would create an Azure Data Factory pipeline and add a Mapping Data Flow activity to it. The data flow's source reads the files from Azure Blob storage, the transformation logic is built in the drag-and-drop interface, and the data flow's sink writes the results to Azure Data Lake Storage, so no separate copy step is required for the transformation itself.
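A hedged sketch of how the pipeline definition might look in ADF's JSON format, assuming a data flow named BlobToAdlsFlow has already been created (the pipeline and activity names here are invented for illustration):

```json
{
  "name": "TransformBlobToAdlsPipeline",
  "properties": {
    "activities": [
      {
        "name": "RunMappingDataFlow",
        "type": "ExecuteDataFlow",
        "typeProperties": {
          "dataFlow": {
            "referenceName": "BlobToAdlsFlow",
            "type": "DataFlowReference"
          },
          "compute": {
            "coreCount": 8,
            "computeType": "General"
          }
        }
      }
    ]
  }
}
```

The compute block controls the size of the Spark cluster that runs the data flow; the values shown are an example, not a recommendation.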
Option A, Azure Storage Sync, keeps on-premises file servers in sync with Azure file shares; it does not transform data. Option B, Azure Databricks, provides a fast, easy, and collaborative Apache Spark-based analytics platform; it can transform data with notebooks and jobs, but mapping data flow is a feature of Azure Data Factory, not of Databricks. Option C, Azure Data Box Gateway, is a virtual appliance that acts as a gateway for transferring data into Azure Storage; it performs no transformation.
Therefore, the correct option to use for the given scenario is option D, Azure Data Factory.