You work for a management consulting firm as a machine learning specialist.
You are on a team of data scientists and other machine learning specialists.
Your team has been assigned the task of building a machine learning model to predict Return On Investment (ROI) for new potential engagements that your management consultants may wish to take onto their book of business. You have a dataset of past engagements with many features that can help you frame the task as a machine learning problem.
Before you decide on which machine learning algorithms to evaluate, you wish to visualize the historical data to get an idea of the relationships between three of the key features of your dataset: ROI, investment time, and investment size. Which type of visualization would best give you an idea of the relationship between these three features?
A. Pie chart
B. Tree map
C. Column histogram
D. Bar chart
E. Bubble chart
F. Line chart
Answer: E.
Option A is incorrect.
A pie chart is best used to show the portion of the total for each slice of the pie.
This type of chart doesn't work well with three dimensions, such as ROI, investment time, and investment size.
Option B is incorrect.
A tree map chart also shows the portion of the total.
This type of chart is good for data with a long tail.
But it also would not work well on three dimensions.
Option C is incorrect.
Column histograms are distribution charts.
They show how data is distributed at intervals.
But you are looking for a visualization that shows the relationship between three variables.
Option D is incorrect.
A bar chart is a comparison chart.
These types of charts are good for showing how feature values change over time or to show a static snapshot of how different variables compare with each other.
But you are looking for the relationship between three variables, not change over time or a static snapshot comparison.
Option E is correct.
A bubble chart is a relationship chart.
For a relationship between two variables, you could use a scatter chart.
For the relationship between three variables, a bubble chart encodes them as follows: investment time on the x-axis, ROI on the y-axis, and investment size as the bubble size.
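The encoding described for Option E can be sketched in Python. This is a minimal illustration, not a prescribed implementation: the `Engagement` record, its field names, the units, the sample values, and the area range are all assumptions made for the example, and the drawing step assumes matplotlib is installed. Investment size is linearly rescaled to a marker area so that the largest engagement does not swamp the chart.

```python
from dataclasses import dataclass


@dataclass
class Engagement:
    # Hypothetical record; field names and units are assumptions for illustration.
    investment_time_months: float
    investment_size_usd: float
    roi_pct: float


def bubble_areas(sizes, min_area=20.0, max_area=600.0):
    """Linearly rescale raw investment sizes to marker areas (points squared)."""
    lo, hi = min(sizes), max(sizes)
    if hi == lo:
        # All investments are equal: draw every bubble at a mid-range area.
        return [(min_area + max_area) / 2.0] * len(sizes)
    span = hi - lo
    return [min_area + (s - lo) / span * (max_area - min_area) for s in sizes]


def plot_bubbles(engagements):
    """Draw x = investment time, y = ROI, bubble area = investment size."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    x = [e.investment_time_months for e in engagements]
    y = [e.roi_pct for e in engagements]
    areas = bubble_areas([e.investment_size_usd for e in engagements])

    fig, ax = plt.subplots()
    # In matplotlib, the `s` argument of scatter is the marker area.
    ax.scatter(x, y, s=areas, alpha=0.5, edgecolors="black")
    ax.set_xlabel("Investment time (months)")
    ax.set_ylabel("ROI (%)")
    ax.set_title("ROI vs. investment time (bubble area = investment size)")
    return fig, ax


# Made-up sample data, purely illustrative.
engagements = [
    Engagement(6, 100_000, 8.0),
    Engagement(12, 250_000, 12.5),
    Engagement(18, 500_000, 9.0),
    Engagement(24, 1_000_000, 15.0),
]
```

Calling `plot_bubbles(engagements)` and then `matplotlib.pyplot.show()` would display the chart. Amazon QuickSight offers a comparable visual through its own interface, so the code here is only a local stand-in for exploring the idea.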
Option F is incorrect.
A line chart is used to show a comparison of variables changing over time.
You are looking for a relationship between three variables, not how they change over time.
Reference:
Please see the AWS Data Visualization page and the Amazon QuickSight overview page.
To visualize the relationship between three continuous variables, a 3D scatter plot or a bubble chart would be the most appropriate visualization. However, since a 3D plot can be difficult to interpret, a 2D plot that encodes the third variable as marker size or color, which is what a bubble chart does, is usually the better choice.
In this case, we have three variables: ROI (continuous), investment time (continuous), and investment size (continuous). The goal is to see the relationship between these three variables.
A line chart is not the best visualization option, as it is better suited to showing trends over time. A pie chart and a treemap are not appropriate, as they show how categories make up parts of a whole. A column histogram shows the distribution of a single variable, and a bar chart compares values across categories, so neither lets us visualize the relationship between three continuous variables.
Therefore, the best option for visualizing the relationship between ROI, investment time, and investment size is the bubble chart. In a bubble chart, the x and y axes represent two of the continuous variables (here, investment time and ROI), and the size of the bubble, or alternatively its color, represents the third continuous variable (here, investment size).
By plotting the data points in a bubble chart, we can easily see if there is any correlation between ROI and investment time or investment size. Additionally, we can use the size or color of the bubble to further analyze the relationship between these variables.
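A visual impression of correlation can be cross-checked numerically. The sketch below computes the Pearson correlation coefficient, which is one possible measure and is chosen here as an assumption rather than taken from the question; it captures only linear relationships. The sample values are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up values, purely illustrative.
investment_time = [6, 12, 18, 24]
investment_size = [100_000, 250_000, 500_000, 1_000_000]
roi = [8.0, 12.5, 9.0, 15.0]

print(f"ROI vs. investment time: {pearson(investment_time, roi):+.2f}")
print(f"ROI vs. investment size: {pearson(investment_size, roi):+.2f}")
```

A coefficient near +1 or -1 suggests a strong linear relationship, while a value near 0 does not rule out a nonlinear one, which is one reason to look at the chart as well as the number.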
In summary, a bubble chart would be the most appropriate visualization for analyzing the relationship between ROI, investment time, and investment size.