Ricky is a cloud engineer engaged to build an Azure Databricks visualization dashboard.
He is required to implement a visual that shows a set of execution metrics for a given task's execution, including the size and duration of data shuffling and the durations of data serialization and deserialization.
Which type of Azure Databricks monitoring visual can he create in this scenario?
A. Job Latency
B. Task Latency
C. Task Metrics
D. Sum Task Execution per Host
Correct Answer: C.
Based on the scenario described, Ricky is required to create a visual that can show a set of execution metrics over a given task's execution in Azure Databricks. The metrics to be included are the size and duration of data shuffling, as well as the duration of data serialization and deserialization.
Out of the four options provided, the most appropriate monitoring visual that Ricky can create for this scenario is "Task Metrics" (option C). Task Metrics is a visualization option in Azure Databricks that provides detailed information about the performance of individual tasks that are part of a job run.
Task Metrics can provide insights into task execution time, the amount of data processed, the number of records processed, and the number of shuffle reads and writes performed. In Ricky's case, he can use Task Metrics to monitor the size and duration of data shuffling and the durations of data serialization and deserialization for each task.
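As a rough illustration of the kind of per-task data such a visual summarizes, the sketch below totals shuffle and serialization metrics over a few hand-written task records. The field names are only loosely modeled on Spark's task metrics, and the records, values, and the helper `summarize_task_metrics` are hypothetical; none of this is a Databricks API.

```python
from typing import Dict, List

# Hypothetical sample records; every value here is invented for illustration.
# Times are in milliseconds, sizes in bytes.
SAMPLE_TASKS: List[Dict[str, int]] = [
    {"taskId": 0, "executorDeserializeTime": 12, "resultSerializationTime": 1,
     "shuffleBytesRead": 2048, "shuffleBytesWritten": 1024, "shuffleFetchWaitTime": 5},
    {"taskId": 1, "executorDeserializeTime": 8, "resultSerializationTime": 2,
     "shuffleBytesRead": 4096, "shuffleBytesWritten": 512, "shuffleFetchWaitTime": 7},
    {"taskId": 2, "executorDeserializeTime": 20, "resultSerializationTime": 3,
     "shuffleBytesRead": 0, "shuffleBytesWritten": 2048, "shuffleFetchWaitTime": 0},
]

# Metrics of interest in the scenario: shuffle size and duration,
# plus serialization and deserialization durations.
METRIC_KEYS = (
    "executorDeserializeTime",
    "resultSerializationTime",
    "shuffleBytesRead",
    "shuffleBytesWritten",
    "shuffleFetchWaitTime",
)


def summarize_task_metrics(tasks: List[Dict[str, int]]) -> Dict[str, int]:
    """Sum each metric of interest across all task records."""
    totals = {key: 0 for key in METRIC_KEYS}
    for task in tasks:
        for key in METRIC_KEYS:
            # Treat a missing metric as zero rather than failing.
            totals[key] += task.get(key, 0)
    return totals


if __name__ == "__main__":
    for name, value in summarize_task_metrics(SAMPLE_TASKS).items():
        print(f"{name}: {value}")
```

A real dashboard would read these values from the cluster's logged metrics rather than from an in-memory list; the aggregation step is the only part this sketch is meant to show.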
Job Latency (option A) and Task Latency (option B) are other monitoring visuals available in Azure Databricks. Job Latency provides information about the duration of a job run, while Task Latency provides information about the duration of individual tasks within a job run. However, these options do not provide the level of detail needed to monitor the specific metrics required in Ricky's scenario.
Sum Task Execution per Host (option D) provides information about the total execution time of tasks grouped by host. This visualization option can be useful in identifying hosts that may be experiencing performance issues or bottlenecks. However, it does not provide the level of detail needed to monitor the specific metrics required in Ricky's scenario.
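To make the contrast concrete, a per-host sum collapses each task to a single duration and groups by host, which is why it cannot show shuffle or serialization detail. The sketch below assumes hypothetical task records with `host` and `durationMs` fields; the names, values, and the helper `sum_execution_per_host` are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List, Union

# Hypothetical task runs; hosts and durations are invented for illustration.
SAMPLE_TASK_RUNS: List[Dict[str, Union[str, int]]] = [
    {"host": "worker-1", "durationMs": 120},
    {"host": "worker-2", "durationMs": 300},
    {"host": "worker-1", "durationMs": 80},
    {"host": "worker-3", "durationMs": 150},
]


def sum_execution_per_host(task_runs: List[Dict[str, Union[str, int]]]) -> Dict[str, int]:
    """Total task execution time per host, in milliseconds."""
    totals: Dict[str, int] = defaultdict(int)
    for run in task_runs:
        totals[str(run["host"])] += int(run["durationMs"])
    return dict(totals)


if __name__ == "__main__":
    # Print hosts from slowest to fastest total execution time.
    per_host = sum_execution_per_host(SAMPLE_TASK_RUNS)
    for host, total in sorted(per_host.items(), key=lambda item: -item[1]):
        print(f"{host}: {total} ms")
```

With this sample data, `worker-2` has the largest total, which is the kind of signal this visual is suited to surface.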
In summary, Ricky should choose Task Metrics (option C) to create the required visualization, since it is the only option that covers the size and duration of data shuffling and the durations of data serialization and deserialization for each task.