Econometric Theory/Multiple Regression Analysis

Our first regressions (estimated by MLE and OLS) were bivariate: each fitted line related just two variables. In most economic data, however, many independent variables can affect a dependent variable, so we expand our explanatory functions to allow multiple independent variables.

Instead of our functions looking like $$Y_i = \alpha + \beta X_i + \epsilon_i$$, our functions look like $$Y_i = \beta_0 + \beta_1 X_{1,i} + \beta_2 X_{2,i} + \cdots + \beta_n X_{n,i} + \epsilon_i$$
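As a minimal sketch of what estimating such a function looks like in practice, the Python snippet below simulates data with two independent variables and recovers the coefficients by OLS. The sample size `N`, the "true" coefficients (1, 2, -3), the noise scale, and the variable names are all assumptions made up for illustration; they do not come from any real data set.

```python
import numpy as np

# Simulated data: every number below is an illustrative assumption.
rng = np.random.default_rng(0)
N = 200                              # sample size (assumed)
X1 = rng.normal(size=N)              # first independent variable
X2 = rng.normal(size=N)              # second independent variable
eps = rng.normal(scale=0.5, size=N)  # error term
Y = 1.0 + 2.0 * X1 - 3.0 * X2 + eps  # Y_i = b0 + b1*X1_i + b2*X2_i + e_i

# Design matrix: a column of ones for the intercept, then one column per variable.
X = np.column_stack([np.ones(N), X1, X2])

# OLS estimate: the coefficient vector that minimizes the sum of squared residuals.
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print("estimated (beta_0, beta_1, beta_2):", beta_hat)
```

With more independent variables, only the design matrix changes: each new variable becomes one more column.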

By adding more variables and data to our model, we can hopefully get a better fit and a better understanding of the dependent variable. However, with the added variables come added problems that can mislead us about how good our model is.

Goodness of Fit
When we move to the multiple regression case, our goodness of fit looks much like it did in the bivariate case. We still have TSS = ESS + RSS, and we can still use our Coefficient of Determination, R² = ESS/TSS = 1 - (RSS/TSS), but there is a problem associated with it. R² will never decrease when a variable is added, whether or not that variable helps us explain our dependent variable. OLS chooses the coefficients that minimize RSS, and the larger model could always reproduce the old fit by setting the new coefficient to zero, so RSS cannot rise and ESS will be greater than or equal to what we had before. Our R² therefore stays the same or increases, even if the new variable hurts our model. There is a tool to fix this problem. R² is replaced with Adjusted R², which adjusts it for the degrees of freedom used up by the added variables. Adjusted R² is signified by adding a bar above the 'R.'

$$ \bar{R}^2 = 1 - \frac{\widehat{\mathrm{var}}(\epsilon)}{\widehat{\mathrm{var}}(Y)} $$
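To make the comparison concrete, the sketch below computes both measures for a model with and without an irrelevant regressor. It assumes the usual degrees-of-freedom-corrected variance estimates, RSS/(N - p) for the error and TSS/(N - 1) for Y, where N is the sample size and p is the number of estimated coefficients including the intercept (so p = n + 1 in the notation above). The helper name `r2_and_adjusted`, the simulated data, and the symbols N and p are assumptions introduced for illustration.

```python
import numpy as np

def r2_and_adjusted(X, Y):
    """Return (R^2, adjusted R^2) for an OLS fit of Y on X.

    X is assumed to already contain a column of ones for the intercept.
    """
    N, p = X.shape                            # p counts the intercept too
    beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta_hat
    rss = resid @ resid                       # residual sum of squares
    tss = np.sum((Y - Y.mean()) ** 2)         # total sum of squares
    r2 = 1 - rss / tss
    adj_r2 = 1 - (rss / (N - p)) / (tss / (N - 1))
    return r2, adj_r2

# Simulated data (illustrative assumption): Y depends on X1 only.
rng = np.random.default_rng(1)
N = 50
X1 = rng.normal(size=N)
junk = rng.normal(size=N)                     # regressor unrelated to Y
Y = 1.0 + 2.0 * X1 + rng.normal(size=N)

X_small = np.column_stack([np.ones(N), X1])
X_big = np.column_stack([X_small, junk])      # same model plus the junk variable

r2_small, adj_small = r2_and_adjusted(X_small, Y)
r2_big, adj_big = r2_and_adjusted(X_big, Y)
print("without junk: R2 =", r2_small, " adjusted R2 =", adj_small)
print("with junk:    R2 =", r2_big, " adjusted R2 =", adj_big)
```

R² for the larger model cannot be lower than for the smaller one. Adjusted R² carries no such guarantee: it can fall when the added variable contributes too little to offset the lost degree of freedom, which is what makes it useful for comparing models of different sizes.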