You work as a machine learning specialist for a marketing consulting firm with a new client that requires marketing data to help them determine which marketing campaign will be the most productive for their new product.
You and your machine learning team are creating a Data Flow in SageMaker Studio.
You plan to use this Data Flow to prepare and visualize your data using SageMaker Data Wrangler.
In SageMaker Data Wrangler, you are building your data preparation pipeline.
The data you are using is gathered from marketing data providers.
You have imported your source data from your S3 bucket, and you are now configuring your transform for your dataset.
However, you have discovered that the built-in transforms provided by Data Wrangler do not meet your transformation needs.
Which options are a viable approach to building your transform in Data Wrangler? (Select TWO)
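A. Select a custom transform step and write your custom transform in the Python (Scikit) programming language.
B. Choose the built-in transform that most closely resembles the transform you need and customize that built-in transform.
C. Select a custom transform step and write your custom transform in the Python (Pandas) programming language.
D. Select a custom transform step and write your custom transform in the Scala programming language.
E. Select a custom transform step and write your custom transform in the Python (PySpark) programming language.
Correct Answers: C and E.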
Option A is incorrect.
There are only three options for the language when creating a custom transform in SageMaker Data Wrangler: Python (PySpark), Python (Pandas), and SQL (PySpark SQL).
Option B is incorrect.
You cannot customize the built-in SageMaker Data Wrangler transforms.
Option C is correct.
Python (Pandas) is one of the available languages for creating a custom SageMaker Data Wrangler transform.
Option D is incorrect.
There are only three options for the language when creating a custom transform in SageMaker Data Wrangler: Python (PySpark), Python (Pandas), and SQL (PySpark SQL).
Option E is correct.
Python (PySpark) is one of the available languages for creating a custom SageMaker Data Wrangler transform.
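As an illustration of option E, here is a minimal sketch of the kind of logic a Python (PySpark) custom transform might contain. The column names (`spend`, `clicks`, `cost_per_click`) and the function name `custom_transform` are hypothetical, and the sketch assumes the input is a Spark DataFrame.

```python
def custom_transform(df):
    """Hypothetical transform on a Spark DataFrame: filter rows and add a column."""
    # Imported lazily so this sketch can be defined without a Spark session.
    from pyspark.sql import functions as F

    return (
        # Keep only campaigns that actually spent money.
        df.filter(F.col("spend") > 0)
        # Derive cost per click; rows with zero clicks get null
        # because F.when has no otherwise() branch.
        .withColumn(
            "cost_per_click",
            F.when(F.col("clicks") > 0, F.col("spend") / F.col("clicks")),
        )
    )
```

Assuming Data Wrangler exposes the dataset to the custom transform as a variable named `df`, the step body would reduce to something like `df = custom_transform(df)`.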
References:
Amazon SageMaker Developer Guide, Use the Amazon SageMaker Studio Launcher (https://docs.aws.amazon.com/sagemaker/latest/dg/studio-launcher.html)
Amazon SageMaker Developer Guide, Create and Use a Data Wrangler Flow (https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler-data-flow.html)
If the built-in transforms provided by SageMaker Data Wrangler do not meet your transformation needs, you can add a custom transform step and write the transformation logic yourself. Of the five options, two are viable:
A. Select a custom transform step and write your custom transform in the Python (Scikit) programming language: This option is not viable. Python (Scikit) is not one of the languages offered for a custom transform in Data Wrangler.
B. Choose the built-in transform that most closely resembles the transform you need and customize that built-in transform: This option is not viable. You cannot customize the built-in Data Wrangler transforms.
C. Select a custom transform step and write your custom transform in the Python (Pandas) programming language: This option is viable. It lets you write your own transformation logic using Pandas library functions, which is useful for custom data manipulation such as filtering rows or deriving new columns.
D. Select a custom transform step and write your custom transform in the Scala programming language: This option is not viable. Data Wrangler does not offer Scala as a language for custom transforms.
E. Select a custom transform step and write your custom transform in the Python (PySpark) programming language: This option is viable. It lets you write your own transformation logic using the PySpark DataFrame API.
In summary, the two viable approaches to building your custom transform in SageMaker Data Wrangler are selecting a custom transform step and writing the transform in either the Python (Pandas) or Python (PySpark) programming language. Python (Scikit), Scala, and customizing a built-in transform are not available options.
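To make the Python (Pandas) approach concrete, the following is a minimal sketch of custom transformation logic run against a small stand-in dataset. The column names, the sample values, and the function name `custom_transform` are all hypothetical. It assumes the dataset is available as a Pandas DataFrame named `df`, which is an assumption about how Data Wrangler hands data to a custom transform rather than something stated above.

```python
import pandas as pd


def custom_transform(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical custom transform: keep paid campaigns and derive cost per click."""
    # Keep only rows where money was actually spent.
    df = df[df["spend"] > 0].copy()
    # Derive a new feature; zero clicks are replaced with NaN first
    # so the division does not produce inf.
    df["cost_per_click"] = df["spend"] / df["clicks"].where(df["clicks"] > 0)
    return df


# Stand-in for the dataset that Data Wrangler would supply (made-up values).
df = pd.DataFrame(
    {
        "campaign": ["email", "social", "search"],
        "spend": [100.0, 0.0, 250.0],
        "clicks": [50, 10, 0],
    }
)
df = custom_transform(df)
print(df)
```

Wrapping the logic in a function is only a convenience for testing it outside Data Wrangler; in a custom transform step the same statements could be applied to `df` directly.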