Linear Regression Assumptions | CFA Level 1 Exam Preparation

Assumptions Underlying Linear Regression

Question

Which of the following are true assumptions underlying linear regression?

(1) For each value of X, there is a group of Y values which are normally distributed

(2) The means of these normal distributions of Y values all lie on the straight line of regression

(3) The standard deviations of these normal distributions are equal

Answers

Explanations

A. B. C. D. E.

A

All three statements are assumptions of linear regression. Make sure you know all the assumptions.

The assumptions underlying linear regression are important for understanding the validity and reliability of the regression model. Let's examine each statement in the question and determine which assumptions are true.

(1) For each value of X, there is a group of Y values which are normally distributed: This is the normality assumption. For any given X, the observed Y values scatter around the regression line, and the error terms (the differences between the observed Y values and the values on the line) are assumed to follow a normal distribution. Because the regression line is fixed at a given X, normally distributed errors mean normally distributed Y values at that X. This assumption underpins the hypothesis tests and confidence intervals for the regression coefficients, which rely on the t- and F-distributions. Therefore, statement (1) is true.
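As a rough illustration of the normality assumption, the sketch below simulates data whose errors really are normal, fits a line by least squares, and checks that the residuals have skewness and excess kurtosis close to zero. The sample size, coefficients, seed, and the helper names `fit_line` and `skewness_and_excess_kurtosis` are all assumptions made for this example, not part of the question.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit with one predictor; returns (intercept, slope)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def skewness_and_excess_kurtosis(values):
    """Moment-based shape measures; both are near 0 for normal data."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical data: Y = 2 + 0.5 * X + normal error with standard deviation 1.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(5000)]
ys = [2.0 + 0.5 * x + rng.gauss(0, 1.0) for x in xs]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
skew, excess_kurt = skewness_and_excess_kurtosis(residuals)

print(f"skewness = {skew:.3f}, excess kurtosis = {excess_kurt:.3f}")
```

Values far from zero on real data would suggest the residuals are not normal. A formal test would normally be used in practice; this moment check is only a quick diagnostic.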

(2) The means of these normal distributions of Y values all lie on the straight line of regression: This assumption is known as the assumption of linearity. It assumes that the relationship between the independent variable(s) (X) and the dependent variable (Y) can be described by a straight line. In other words, the average value of Y at any given value of X lies on the regression line. This assumption is fundamental to linear regression modeling, and it allows for the estimation of the slope and intercept of the regression line. Therefore, statement (2) is true.
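The linearity assumption can be seen in a small simulation: if X takes only a few distinct values, the average of the Y values at each X should sit close to the fitted line. This is a minimal sketch on made-up data; the coefficients, group sizes, seed, and the helper name `fit_line` are assumptions chosen for illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit with one predictor; returns (intercept, slope)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: five X levels, 2000 Y values per level,
# generated from Y = 1 + 2 * X + normal error.
rng = random.Random(1)
x_levels = [1, 2, 3, 4, 5]
xs, ys = [], []
for x in x_levels:
    for _ in range(2000):
        xs.append(x)
        ys.append(1.0 + 2.0 * x + rng.gauss(0, 1.0))

intercept, slope = fit_line(xs, ys)

# Compare the mean of each group of Y values with the line's value at that X.
gaps = []
for x in x_levels:
    group = [yi for xi, yi in zip(xs, ys) if xi == x]
    group_mean = statistics.fmean(group)
    fitted = intercept + slope * x
    gaps.append(group_mean - fitted)
    print(f"X = {x}: group mean = {group_mean:.3f}, fitted = {fitted:.3f}")

max_gap = max(abs(g) for g in gaps)
```

If the true relationship were curved, the group means would drift away from the line in a systematic pattern (for example, above it at the ends and below it in the middle).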

(3) The standard deviations of these normal distributions are equal: This is the assumption of homoscedasticity, or constant variance of the error terms. It assumes that the spread of the errors is the same at every level of the independent variable(s). If heteroscedasticity is present (meaning the variability of the errors changes systematically with the independent variable), the least-squares coefficient estimates remain unbiased, but their standard errors are estimated incorrectly, so t-tests and confidence intervals become unreliable. Therefore, statement (3) is true.
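One informal way to look at the equal-spread assumption is to fit the line, split the residuals into a low-X half and a high-X half, and compare their standard deviations. The sketch below does this for two simulated data sets, one with constant error spread and one where the spread grows with X. All numbers and the helper names `fit_line` and `residual_sd_ratio` are hypothetical choices for this example.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit with one predictor; returns (intercept, slope)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residual_sd_ratio(xs, ys):
    """Residual SD in the high-X half divided by residual SD in the low-X half."""
    intercept, slope = fit_line(xs, ys)
    pairs = sorted(zip(xs, ys))  # order observations by X
    residuals = [y - (intercept + slope * x) for x, y in pairs]
    half = len(residuals) // 2
    return statistics.pstdev(residuals[half:]) / statistics.pstdev(residuals[:half])


rng = random.Random(2)
xs = [rng.uniform(0, 10) for _ in range(4000)]

# Constant error spread (homoscedastic) versus spread proportional to X.
ys_const = [2.0 + 0.5 * x + rng.gauss(0, 1.5) for x in xs]
ys_growing = [2.0 + 0.5 * x + rng.gauss(0, 0.3 * x) for x in xs]

ratio_const = residual_sd_ratio(xs, ys_const)
ratio_growing = residual_sd_ratio(xs, ys_growing)

print(f"constant spread: SD ratio = {ratio_const:.2f}")
print(f"growing spread:  SD ratio = {ratio_growing:.2f}")
```

A ratio near 1 is consistent with equal standard deviations, while a ratio well above 1 indicates that the spread widens as X increases. This split-sample comparison is only a quick check, not a formal test for heteroscedasticity.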

To summarize, statements (1), (2), and (3) are all true assumptions underlying linear regression. Therefore, the correct answer is A.