Econometric Theory/Serial Correlation

There are times, especially in time-series data, when the classical linear regression (CLR) assumption of $$ corr(\epsilon_t, \epsilon_{t-1})=0 $$ is violated. This is known in econometrics as serial correlation or autocorrelation. It means that $$ corr(\epsilon_t, \epsilon_{t-1}) \ne 0 $$, so there is a pattern across the error terms: they are not independently distributed across the observations and are not strictly random.

Examples of Autocorrelation
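Positive autocorrelation: $$ corr(\epsilon_t, \epsilon_{t-1})>0 $$. An error tends to be followed by an error of the same sign.

Negative autocorrelation: $$ corr(\epsilon_t, \epsilon_{t-1})<0 $$. An error tends to be followed by an error of the opposite sign.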

Functional Form
When the error term is related to the previous error term, the relationship can be written as an algebraic equation: $$ \epsilon_t = \rho \epsilon_{t-1} + u_t $$ where ρ is the autocorrelation coefficient between the two disturbance terms, and $$u_t$$ is a purely random disturbance term. This is known as an Autoregressive Process, with $$-1< \rho = corr(\epsilon_t, \epsilon_{t-1}) < 1$$. The $$u_t$$ term is needed because, although $$\epsilon_t$$ is partly predictable from $$\epsilon_{t-1}$$, it still contains a random component.

Autoregressive model

 * First order Autoregressive Process, AR(1):$$ \epsilon_t = \rho \epsilon_{t-1} + u_t $$
 * This is known as a first order autoregression because the error term depends only on the immediately preceding error term.
 * nth order Autoregressive Process, AR(n):$$ \epsilon_t = \rho_1 \epsilon_{t-1} + \rho_2 \epsilon_{t-2} + \cdots + \rho_n \epsilon_{t-n} + u_t $$
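
The AR(1) process above can be illustrated with a small simulation. This is a minimal sketch using only the Python standard library; the function names `simulate_ar1` and `lag1_corr`, the normal distribution for $$u_t$$, and the parameter values are assumptions made for illustration, not part of the text above.

```python
import random

def simulate_ar1(rho, n, seed=0, sigma=1.0):
    """Simulate eps_t = rho * eps_{t-1} + u_t with u_t ~ N(0, sigma^2).

    The first value is a single N(0, sigma^2) draw (an assumption of this
    sketch), so the earliest observations are not exactly stationary.
    """
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, sigma)]
    for _ in range(n - 1):
        eps.append(rho * eps[-1] + rng.gauss(0.0, sigma))
    return eps

def lag1_corr(x):
    """Sample correlation between x_t and x_{t-1}."""
    a, b = x[1:], x[:-1]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / (va * vb) ** 0.5

# With a long series, the sample lag-1 correlation should be close to rho.
for rho in (0.7, -0.5):
    eps = simulate_ar1(rho=rho, n=5000)
    print(rho, round(lag1_corr(eps), 2))
```

Changing the sign of `rho` switches between the positive and negative autocorrelation patterns.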

Moving-average model
The notation MA(q) refers to the moving average model of order q:


 * $$ X_t = \mu + \varepsilon_t + \sum_{i=1}^q \theta_i \varepsilon_{t-i}\,$$

where $$\theta_1, \ldots, \theta_q$$ are the parameters of the model, μ is the expectation of $$X_t$$ (often assumed to equal 0), and $$\varepsilon_t$$, $$\varepsilon_{t-1}$$, ... are white noise error terms. The moving-average model is essentially a finite impulse response filter with some additional interpretation placed on it.
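
A short simulation can make the MA(q) definition concrete. The sketch below assumes standard normal white noise and uses hypothetical helper names (`simulate_ma`, `autocorr`). For an MA(1) process the lag-1 autocorrelation is $$\theta_1 / (1 + \theta_1^2)$$ and autocorrelations beyond lag q are zero, which the sample values should roughly reproduce.

```python
import random

def simulate_ma(thetas, n, mu=0.0, seed=0):
    """Simulate X_t = mu + e_t + sum_i theta_i * e_{t-i} with e_t ~ N(0, 1)."""
    rng = random.Random(seed)
    q = len(thetas)
    # Draw q extra shocks so that every X_t has a full set of lagged shocks.
    e = [rng.gauss(0.0, 1.0) for _ in range(n + q)]
    x = []
    for t in range(q, n + q):
        lagged = sum(thetas[i] * e[t - 1 - i] for i in range(q))
        x.append(mu + e[t] + lagged)
    return x

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    m = sum(x) / len(x)
    num = sum((x[t] - m) * (x[t - lag] - m) for t in range(lag, len(x)))
    den = sum((v - m) ** 2 for v in x)
    return num / den

theta = 0.5
x = simulate_ma([theta], n=20000)
print("theoretical lag 1:", theta / (1 + theta ** 2))
print("sample lag 1:", round(autocorr(x, 1), 3))
print("sample lag 2:", round(autocorr(x, 2), 3))  # should be near zero
```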

Autoregressive–moving-average model
The notation ARMA(p, q) refers to the model with p autoregressive terms and q moving-average terms. This model contains the AR(p) and MA(q) models as special cases:


 * $$ X_t = c + \varepsilon_t + \sum_{i=1}^p \varphi_i X_{t-i} + \sum_{i=1}^q \theta_i \varepsilon_{t-i}.\,$$
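
The ARMA(p, q) recursion can be sketched in the same way. The function name `simulate_arma`, the standard normal shocks, and the burn-in length are assumptions of this illustration. Passing an empty list for `thetas` gives a pure AR(p) process and an empty list for `phis` gives a pure MA(q) process. For a stationary process the mean is $$c / (1 - \sum_i \varphi_i)$$, which the sample mean should approximate.

```python
import random

def simulate_arma(phis, thetas, n, c=0.0, seed=0, burn_in=200):
    """Simulate X_t = c + e_t + sum_i phi_i X_{t-i} + sum_j theta_j e_{t-j}.

    Shocks are N(0, 1). Terms with unavailable lags are skipped at the start,
    and the first burn_in values are discarded to reduce start-up effects.
    """
    rng = random.Random(seed)
    p, q = len(phis), len(thetas)
    total = n + burn_in
    e = [rng.gauss(0.0, 1.0) for _ in range(total)]
    x = []
    for t in range(total):
        ar = sum(phis[i] * x[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(thetas[j] * e[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        x.append(c + e[t] + ar + ma)
    return x[burn_in:]

x = simulate_arma(phis=[0.5], thetas=[0.4], n=20000, c=1.0)
print("sample mean:", round(sum(x) / len(x), 2))
print("theoretical mean:", 1.0 / (1 - 0.5))
```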

Causes of Autocorrelation
Autocorrelation, $$ corr(\epsilon_t, \epsilon_{t-1}) \ne 0 $$, can arise from several sources.
 * 1) Spatial Autocorrelation
 * Spatial autocorrelation occurs when the two errors are spatially and/or geographically related. In simpler terms, they are "next to each other." Example: The city of St. Paul has a spike of crime and so it hires additional police. The following year, it finds that the crime rate has decreased significantly. Meanwhile, the city of Minneapolis, which had not adjusted its police force, finds that its crime rate has increased over the same period.
 * Note: this type of autocorrelation occurs over cross-sectional samples.
 * 2) Inertia/Time to Adjust
 * This often occurs in macroeconomic time-series data. The US interest rate unexpectedly increases, and so there is an associated change in exchange rates with other countries. Reaching a new equilibrium could take some time.
 * 3) Prolonged Influences
 * This is again a macroeconomic time-series issue dealing with economic shocks. It is now expected that the US interest rate will increase. The associated exchange rates will slowly adjust up until the announcement by the Federal Reserve and may overshoot the equilibrium.
 * 4) Data Smoothing/Manipulation
 * Using functions to smooth data will bring autocorrelation into the disturbance terms.
 * 5) Misspecification
 * A regression will often show signs of autocorrelation when there are omitted variables. Because the missing independent variable now exists in the disturbance term, we get a disturbance term that looks like $$ \epsilon_t = \beta_2 X_2 + u_t $$ when the correct specification is $$ Y_t = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u_t $$
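
The misspecification case can be illustrated numerically. In the sketch below all names and parameter values are hypothetical: $$X_2$$ is generated as a persistent series, $$Y_t$$ follows the correct specification with a white-noise $$u_t$$, and the regression wrongly omits $$X_2$$. The residuals of the short regression then absorb $$\beta_2 X_2$$ and inherit its persistence, while $$u_t$$ itself shows no such pattern.

```python
import random

def ar1(rho, n, rng):
    """Series s_t = rho * s_{t-1} + noise, started from a single N(0, 1) draw."""
    s = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        s.append(rho * s[-1] + rng.gauss(0.0, 1.0))
    return s

def simple_ols_residuals(x, y):
    """Residuals from regressing y on a constant and a single regressor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    return [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

def lag1_corr(e):
    """Approximate lag-1 autocorrelation of a (near) zero-mean series."""
    return sum(e[t] * e[t - 1] for t in range(1, len(e))) / sum(v * v for v in e)

rng = random.Random(0)
n = 2000
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x2 = ar1(0.9, n, rng)                       # omitted variable, persistent over time
u = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true white-noise disturbance
y = [1.0 + 2.0 * a + 1.0 * b + c for a, b, c in zip(x1, x2, u)]

resid = simple_ols_residuals(x1, y)          # regression that leaves out X2
r_misspecified = lag1_corr(resid)
r_true = lag1_corr(u)
print("lag-1 correlation, residuals without X2:", round(r_misspecified, 2))
print("lag-1 correlation, true disturbance u:", round(r_true, 2))
```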

Consequences of Autocorrelation
The main problem with autocorrelation is that it may make a model look better than it actually is.

List of consequences

 * 1) Coefficients are still unbiased, because $$E(\epsilon_t) = 0$$ and $$cov(X_t, u_t) = 0 $$ still hold.
 * 2) True variance of $$ \hat{\beta} $$ is increased by the presence of autocorrelation.
 * 3) Estimated variance of $$ \hat{\beta} $$ is typically too small under autocorrelation (biased downward).
 * 4) A decrease in $$ se(\hat{\beta}) $$ and an increase in the t-statistics; this results in the estimator looking more accurate than it actually is.
 * 5) R² becomes inflated.

All of these problems result in hypothesis tests becoming invalid.
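
A small Monte Carlo experiment can illustrate consequences 1 to 4. This is a sketch under stated assumptions rather than a general proof: both the regressor and the error follow AR(1) processes with the same positive coefficient, the true slope is 2, and all names and settings (`monte_carlo`, sample size, number of replications) are hypothetical. The average slope estimate should be close to the true value, while the conventional OLS standard error should on average be noticeably smaller than the actual spread of the slope estimates across replications.

```python
import random
import statistics

def ar1(rho, n, rng):
    """Series s_t = rho * s_{t-1} + noise, started from a single N(0, 1) draw."""
    s = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        s.append(rho * s[-1] + rng.gauss(0.0, 1.0))
    return s

def ols_slope_and_se(x, y):
    """Slope and conventional (no-autocorrelation) standard error for y = b0 + b1 x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    return b1, se

def monte_carlo(rho=0.8, n=100, reps=500, beta=2.0, seed=0):
    """Return (mean slope, std. dev. of slopes, mean reported standard error)."""
    rng = random.Random(seed)
    slopes, ses = [], []
    for _ in range(reps):
        x = ar1(rho, n, rng)      # autocorrelated regressor
        eps = ar1(rho, n, rng)    # autocorrelated error term
        y = [1.0 + beta * xi + ei for xi, ei in zip(x, eps)]
        b1, se = ols_slope_and_se(x, y)
        slopes.append(b1)
        ses.append(se)
    return statistics.mean(slopes), statistics.stdev(slopes), statistics.mean(ses)

mean_slope, true_sd, mean_se = monte_carlo()
print("mean slope estimate:", round(mean_slope, 3))
print("actual std. dev. of slope estimates:", round(true_sd, 3))
print("average reported OLS standard error:", round(mean_se, 3))
```

Setting `rho=0.0` in `monte_carlo` removes the autocorrelation, in which case the two spread measures should roughly agree.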



Testing for Autocorrelation

 * 1) While not conclusive, an impression can be gained by viewing a graph of the residuals, for example plotted against time or against their own lagged values (namely, a residual scatter-plot).
 * 2) Durbin-Watson test:
 * Assume $$ \epsilon_t = \rho \epsilon_{t-1} + u_t $$
 * Test H(0): ρ = 0 (no AC) against H(1): ρ > 0 (one-tailed test)
 * Test statistic, computed from the regression residuals $$ \hat{\epsilon}_t $$: $$ DW = \frac{\sum_{t=2}^{T} (\hat{\epsilon}_t - \hat{\epsilon}_{t-1})^2} {\sum_{t=1}^{T} \hat{\epsilon}_t^2} \approx 2 - 2 \hat{\rho} $$
 * Any value under D(L) (in the D-W table) rejects the null hypothesis: AC exists.
 * Any value between D(L) and D(U) leaves us with no conclusion about AC.
 * Any value larger than D(U) fails to reject the null hypothesis: there is no evidence of AC.
 * Note: this is a one-tailed test for positive AC. To test the other tail (negative AC, H(1): ρ < 0), use 4 - DW as the test statistic instead.
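
The Durbin-Watson statistic is straightforward to compute by hand. The sketch below is illustrative only: the helper names are hypothetical, and for brevity it applies the statistic directly to simulated AR(1) series rather than to residuals from a fitted regression. It does not include the D(L) and D(U) critical values, which must still be looked up in a D-W table for the relevant sample size and number of regressors.

```python
import random

def durbin_watson(resid):
    """DW = sum_{t>=2} (e_t - e_{t-1})^2 / sum_t e_t^2."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

def rho_hat(resid):
    """Simple estimate of rho: sum e_t e_{t-1} / sum e_t^2."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    return num / sum(e * e for e in resid)

def ar1(rho, n, seed):
    """Simulated stand-in for residuals: e_t = rho * e_{t-1} + N(0, 1) noise."""
    rng = random.Random(seed)
    e = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        e.append(rho * e[-1] + rng.gauss(0.0, 1.0))
    return e

# DW is near 2 without autocorrelation, below 2 for rho > 0, above 2 for rho < 0.
for rho in (0.0, 0.6, -0.6):
    e = ar1(rho, 2000, seed=1)
    dw = durbin_watson(e)
    print("rho =", rho,
          "DW =", round(dw, 2),
          "2 - 2*rho_hat =", round(2 - 2 * rho_hat(e), 2),
          "4 - DW =", round(4 - dw, 2))
```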