Hugh is a Data Analyst at Woodgrove Inc.
He's working on data orchestration, using Azure Data Factory (ADF) to copy data from Azure Data Lake Storage Gen2 to Databricks and transform it.
In the pipeline, he uses ADF to build complex solutions with the data flow schema drift feature, applying reusable patterns based on flexible dataset schemas.
He needs to apply schema drift in the source settings of Azure Data Factory so that the data flow source is treated as drifted.
Schema drift is defined as reading columns that aren't defined in the dataset schema.
He enabled the Sampling option in the source settings of the Data Factory pipeline.
Does the solution meet the requirement of enabling "schema drift" in the Data Factory pipeline?
A. Yes
B. No
Correct Answer: B
The solution does not meet the requirement of enabling "schema drift" in the Data Factory pipeline, so the answer is B.
In a source transformation, schema drift means reading columns that aren't defined in the dataset schema. It is enabled by checking the Allow schema drift option in the source settings of the data flow. With that option on, ADF reads the drifted columns from the source and lets them flow through the transformations.
The Sampling option does something different. It limits the number of rows read from the source and is meant for testing or debugging against a subset of the data. It has no effect on which columns are read, so turning it on does not enable schema drift.
Therefore, enabling Sampling does not satisfy the requirement. Hugh needs to check Allow schema drift in the source settings instead.
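The difference between the two source settings can be illustrated with a small Python sketch. This is a simplified conceptual model, not ADF code or an ADF API: the function `read_source` and its parameters `allow_schema_drift` and `sample_size` are hypothetical names chosen to mirror the Allow schema drift and Sampling options, and the sample rows are made up for illustration.

```python
from typing import Any, Optional

Row = dict[str, Any]


def read_source(
    rows: list[Row],
    schema: list[str],
    allow_schema_drift: bool,
    sample_size: Optional[int] = None,
) -> list[Row]:
    """Hypothetical model of a data flow source (not a real ADF API)."""
    # Sampling only caps how many rows are read; it never changes the columns.
    selected = rows if sample_size is None else rows[:sample_size]

    if allow_schema_drift:
        # Drifted columns (those not in the dataset schema) pass through.
        return [dict(row) for row in selected]

    # Without schema drift, only columns defined in the schema are kept.
    return [{k: v for k, v in row.items() if k in schema} for row in selected]


# Illustrative data: "discount" is not part of the defined schema.
schema = ["id", "name"]
rows = [
    {"id": 1, "name": "a", "discount": 0.1},
    {"id": 2, "name": "b", "discount": 0.2},
]

sampled_only = read_source(rows, schema, allow_schema_drift=False, sample_size=1)
drift_enabled = read_source(rows, schema, allow_schema_drift=True)
```

In this model, `sampled_only` holds fewer rows but still lacks the `discount` column, while `drift_enabled` keeps every row along with the undeclared column. That mirrors why Sampling alone does not meet the requirement.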