CFA Level 2
Intermediate
An analyst is building a multiple regression model to predict the price of a house based on various factors. The dataset includes the following variables: house price (dependent variable), square footage, number of bedrooms, number of bathrooms, age of the house, and proximity to the city center (independent variables). The analyst runs the regression and obtains an R-squared of 0.85 and an adjusted R-squared of 0.83. The F-statistic is significant at the 1% level, and the residual plots show no clear patterns. However, when the analyst checks the variance inflation factors (VIF), they find that the VIF for square footage is 8.5 and the VIF for the number of bedrooms is 7.2. Which of the following is the most appropriate interpretation of this finding?
a. The high VIF values for square footage and the number of bedrooms suggest that these variables are not linearly related to the dependent variable, house price, and should be transformed or removed from the model to improve its specification and predictive performance.
b. The high VIF values for square footage and the number of bedrooms suggest the presence of multicollinearity between these independent variables, which may affect the reliability of the individual coefficient estimates but does not impact the overall predictive power of the model.
c. The high VIF values for square footage and the number of bedrooms indicate that these variables are highly correlated with the dependent variable, house price, which is a desirable property in a multiple regression model and enhances the model's predictive accuracy.
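For reference, a VIF is computed from an auxiliary regression of one independent variable on the other independent variables: VIF_j = 1 / (1 - R_j^2). It does not involve the dependent variable at all. The sketch below simply inverts that formula for the two VIF values quoted in the question; the function names are hypothetical and chosen only for illustration.

```python
def vif_from_aux_r2(r2: float) -> float:
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    predictor j on the other independent variables."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("auxiliary R-squared must be in [0, 1)")
    return 1.0 / (1.0 - r2)


def aux_r2_from_vif(vif: float) -> float:
    """Invert the VIF formula to recover the auxiliary R-squared."""
    if vif < 1.0:
        raise ValueError("VIF cannot be below 1")
    return 1.0 - 1.0 / vif


# VIF values taken from the question text.
for name, vif in [("square footage", 8.5), ("number of bedrooms", 7.2)]:
    r2 = aux_r2_from_vif(vif)
    print(f"{name}: VIF={vif}, implied auxiliary R^2={r2:.3f}")
# square footage: VIF=8.5, implied auxiliary R^2=0.882
# number of bedrooms: VIF=7.2, implied auxiliary R^2=0.861
```

Read this way, a VIF of 8.5 means roughly 88% of the variation in square footage is explained by the other independent variables. A commonly cited rule of thumb treats a VIF above 5 as worth investigating and above 10 as a serious concern, though these cutoffs are conventions rather than formal tests.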