Econometric Theory/Asymptotic Convergence

Convergence in Probability
Convergence in probability will be a very useful tool for deriving asymptotic distributions later in this book. Alongside convergence in distribution, it is the mode of convergence we will see most often.

Definition
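A sequence of random variables $$\{ X_n ; n=1,2, \cdots \} $$ converges in probability to $$X$$ if:

$$\forall \epsilon, \delta >0, \; \exists N \; \text{s.t.} \; \forall n \geq N, \; \Pr \{ |X_n - X| > \delta \} < \epsilon$$

an equivalent statement is:

$$\forall \delta >0, \; \lim_{n \to \infty} \Pr \{ |X_n - X| > \delta \} = 0$$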

This will be written as either $$X_n \overset{p}{\longrightarrow} X$$ or $$\operatorname{plim} X_n = X$$.

Example
$$X_n = \begin{cases} \eta & \text{with probability } 1- \frac{1}{n} \\ \theta & \text{with probability } \frac{1}{n} \end{cases}$$

We'll make an intelligent guess that this sequence converges in probability to the degenerate random variable $$\eta$$. So we have that:

$$\forall \delta >0,\; \Pr \{ |X_n - \eta| > \delta \} \leq \Pr \{ |X_n - \eta| > 0 \}= \Pr \{ X_n= \theta \}= \frac{1}{n}$$

Therefore the condition for convergence in probability in this case becomes:

$$\forall \epsilon, \delta >0, \; \exists N \; \text{s.t.} \; \forall n \geq N, \; \Pr \{ |X_n - \eta| > \delta \} \leq \frac{1}{n} < \epsilon$$

So for any positive $$\epsilon \in \mathbb{R}$$ we can always find an $$N \in \mathbb{N}$$ large enough to satisfy this condition: any $$N > 1/\epsilon$$ will do. Therefore we have proved that $$X_n \overset{p}{\longrightarrow} \eta$$.
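The bound $$\Pr \{ |X_n - \eta| > \delta \} \leq \frac{1}{n}$$ can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original text; the values $$\eta = 0$$, $$\theta = 1$$, $$\delta = 0.5$$, the number of draws and the seed are all illustrative assumptions.

```python
import random


def draw_x_n(n, eta, theta, rng):
    """One draw of X_n: theta with probability 1/n, eta otherwise."""
    return theta if rng.random() < 1.0 / n else eta


def tail_probability(n, eta, theta, delta, draws, rng):
    """Monte Carlo estimate of Pr{|X_n - eta| > delta}."""
    hits = sum(abs(draw_x_n(n, eta, theta, rng) - eta) > delta for _ in range(draws))
    return hits / draws


# Illustrative (assumed) values: eta = 0, theta = 1, delta = 0.5.
rng = random.Random(0)
estimates = {n: tail_probability(n, 0.0, 1.0, 0.5, 20000, rng) for n in (2, 10, 100)}

for n, estimate in estimates.items():
    # The estimate should be close to 1/n and shrink as n grows.
    print(n, estimate, 1.0 / n)
```

Each estimate should sit near $$\frac{1}{n}$$, so it falls below any chosen $$\epsilon$$ once $$n$$ exceeds $$1/\epsilon$$.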

Convergence Almost Sure
Almost-sure convergence has a marked similarity to convergence in probability, but the conditions for this mode of convergence are stronger. As we will see later, almost-sure convergence implies that the sequence also converges in probability.

Definition
A sequence of random variables $$\{ X_n ; n=1,2, \cdots \} $$ converges almost surely to the random variable $$X$$ if:

$$\Pr \{ \lim_{n \to \infty} X_n = X \} = 1$$

equivalently:

$$\forall \delta >0, \; \lim_{n \to \infty} \Pr \{ \sup_{m \geq n} |X_m - X| > \delta \} = 0$$

Under these conditions we use the notation $$X_n \overset{a.s.}{\longrightarrow} X$$ or $$\lim_{n \to \infty} X_n = X \; \text{a.s.}$$

Example
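Let's see if our example from the convergence in probability section also converges almost surely. Defining:

$$X_n = \begin{cases} \eta & \text{with probability } 1- \frac{1}{n} \\ \theta & \text{with probability } \frac{1}{n} \end{cases}$$

we again guess that the convergence is to $$\eta$$. Unlike convergence in probability, almost-sure convergence depends on how the $$X_n$$ are defined jointly across $$n$$, which the probabilities above do not determine. Suppose the events $$\{ X_n = \theta \}$$ are nested, for example $$X_n = \theta$$ exactly when $$U < \frac{1}{n}$$ for a single $$U$$ uniformly distributed on $$(0,1)$$. Then:

$$\forall \delta >0, \; \Pr \{ \sup_{m \geq n} |X_m - \eta| > \delta \} \leq \Pr \{ U < \frac{1}{n} \} = \frac{1}{n} \longrightarrow 0$$

which is equivalent to $$\Pr \{ \lim_{n \to \infty} X_n = \eta \} = 1$$, thereby satisfying our definition of almost-sure convergence. If instead the $$X_n$$ were independent, the second Borel-Cantelli lemma would apply because $$\sum_n \frac{1}{n} = \infty$$: $$X_n = \theta$$ would occur infinitely often with probability 1, and almost-sure convergence would fail even though convergence in probability still holds.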
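The probabilities in this example do not say how the $$X_n$$ move together across $$n$$, and almost-sure convergence is a statement about whole sample paths. The sketch below simulates two couplings, both of which are assumptions made for illustration: a nested one driven by a single uniform draw, and one with independent draws. It reports how often $$\theta$$ still appears late in a path; the function names, path count, window and seed are hypothetical choices.

```python
import random


def theta_after(burn_in, horizon, nested, rng):
    """Return True if X_n = theta for some n in (burn_in, horizon]."""
    if nested:
        # Assumed coupling: one uniform U drives the whole path and
        # X_n = theta exactly when U < 1/n, so theta can only show up
        # after burn_in if U < 1/(burn_in + 1).
        u = rng.random()
        return u < 1.0 / (burn_in + 1)
    # Assumed alternative: X_n is drawn independently for each n.
    return any(rng.random() < 1.0 / n for n in range(burn_in + 1, horizon + 1))


def late_theta_fraction(nested, paths=200, burn_in=100, horizon=10_000, seed=0):
    """Fraction of simulated paths in which theta appears after burn_in."""
    rng = random.Random(seed)
    hits = sum(theta_after(burn_in, horizon, nested, rng) for _ in range(paths))
    return hits / paths


nested_fraction = late_theta_fraction(nested=True)
independent_fraction = late_theta_fraction(nested=False)
print(nested_fraction, independent_fraction)
```

With these settings the exact probabilities are $$\frac{1}{101}$$ for the nested coupling and $$1 - \frac{100}{10000} = 0.99$$ for independent draws, so the two simulated fractions should differ sharply even though every $$X_n$$ has the same marginal distribution in both cases.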

Convergence in Distribution
Convergence in distribution will appear very frequently in our econometric models through the use of the Central Limit Theorem. So let's define this type of convergence.

Definition
A sequence of random variables $$\{ X_n ; n=1,2, \cdots \} $$ converges in distribution to the random variable $$X$$ if $$F_{X_n}(\zeta ) \rightarrow F_{X}(\zeta )$$ at every point $$\zeta$$ at which $$F_X$$ is continuous. Here $$F_{X_n}(\zeta )$$ and $$F_{X}(\zeta )$$ are the cumulative distribution functions of $$X_n$$ and $$X$$ respectively.

It is the distribution of the random variable that we are concerned with here. Think of a Student's t distribution: as the degrees of freedom, $$n$$, increase, the distribution becomes closer and closer to a Gaussian distribution. Therefore the random variable $$Y_n \sim t(n)$$ converges in distribution to the random variable $$Y \sim N(0,1)$$. (N.b. writing $$Y_n \overset{d}{\longrightarrow} Y$$ is a notational crutch: it is the distribution functions, not the random variables themselves, that converge, $$F_{Y_n} (\zeta ) \rightarrow F_Y(\zeta )$$.)
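The Student's t illustration can be checked numerically by comparing distribution functions at a fixed point. The sketch below uses only the standard library: it integrates the t density with a trapezoidal rule and compares the result with the standard normal distribution function. The evaluation point $$\zeta = 1$$, the step count and the chosen degrees of freedom are illustrative assumptions.

```python
import math


def t_pdf(x, n):
    """Density of Student's t with n degrees of freedom."""
    log_c = math.lgamma((n + 1) / 2) - math.lgamma(n / 2) - 0.5 * math.log(n * math.pi)
    return math.exp(log_c - (n + 1) / 2 * math.log1p(x * x / n))


def t_cdf(x, n, steps=2000):
    """CDF for x >= 0: 0.5 by symmetry plus a trapezoidal integral over [0, x]."""
    h = x / steps
    area = 0.5 * (t_pdf(0.0, n) + t_pdf(x, n))
    area += sum(t_pdf(i * h, n) for i in range(1, steps))
    return 0.5 + h * area


def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


zeta = 1.0  # assumed evaluation point
gaps = {n: abs(t_cdf(zeta, n) - normal_cdf(zeta)) for n in (1, 5, 30, 1000)}

for n, gap in gaps.items():
    # The gap between the two distribution functions shrinks as n grows.
    print(n, gap)
```

A single evaluation point does not prove convergence, but repeating the comparison on a grid of points gives the same picture.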

Example
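Let's consider the random variable $$X_n$$ that takes the two values $$\frac{1}{n}$$ and 1 with equal probability $$\frac{1}{2}$$. Let $$X$$ have the binomial distribution with a single trial and $$p = \frac{1}{2}$$, so that $$X$$ takes the values 0 and 1 with equal probability. Then $$X_n$$ converges in distribution to $$X$$.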

The proof is simple: we ignore 0 and 1 (where the distribution function of $$X$$ is discontinuous) and prove that, for all other points $$a$$, $$\lim_{n \to \infty} F_{X_n}(a) = F_X(a)$$. Since for $$a < 0$$ all the distribution functions are 0, and for $$a > 1$$ they are all 1, it remains to prove the convergence for $$0 < a < 1$$, where $$F_X(a) = \frac{1}{2}$$. But $$F_{X_n}(a) = \frac{1}{2} ([a \ge \frac{1}{n}] + [a \ge 1])$$ (using Iverson brackets), so for any such $$a$$ choose $$N > 1/a$$, and for $$n > N$$ we have:
 * $$n > 1/a \Rightarrow a > 1/n \Rightarrow [a \ge \frac{1}{n}] = 1 \land [a \ge 1] = 0 \Rightarrow F_{X_n}(a) = \frac{1}{2}$$

So the sequence $$F_{X_n}(a)$$ converges to $$F_X(a)$$ at every point where $$F_X$$ is continuous.
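The two distribution functions in this example are simple enough to evaluate directly. The sketch below encodes the Iverson-bracket formula for $$F_{X_n}$$ and the corresponding formula for $$F_X$$; the function names and the evaluation points are illustrative choices.

```python
def cdf_x_n(a, n):
    """F_{X_n}(a): mass 1/2 at 1/n and mass 1/2 at 1."""
    return 0.5 * ((a >= 1.0 / n) + (a >= 1.0))


def cdf_x(a):
    """F_X(a): mass 1/2 at 0 and mass 1/2 at 1."""
    return 0.5 * ((a >= 0.0) + (a >= 1.0))


for a in (-0.5, 0.0, 0.3, 1.0, 2.0):
    values = [cdf_x_n(a, n) for n in (1, 10, 100, 1000)]
    # Compare the sequence F_{X_n}(a) with its candidate limit F_X(a).
    print(a, values, cdf_x(a))
```

At $$a = 0$$ the sequence $$F_{X_n}(0)$$ stays at 0 while $$F_X(0) = \frac{1}{2}$$. This is why the definition only requires convergence at continuity points of $$F_X$$.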

Convergence in r-th Mean
Convergence in r-th mean, of which mean-square convergence is the case $$r = 2$$, is not going to be used in this book. For completeness, the definition is provided below.

Definition
A sequence of random variables $$\{ X_n ; n=1,2, \cdots \} $$ converges in r-th mean (or in the $$L^r$$ norm) to the random variable $$X$$ if, for a given real number $$r \geq 1$$ and provided that $$E(|X_n|^r) < \infty $$ for all $$n$$,

$$ \lim_{n\to \infty }E\left( \left\vert X_n-X\right\vert ^r\right) =0. $$
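As a worked illustration, take the two-point example from the convergence in probability section, under the assumption that $$\eta$$ and $$\theta$$ are fixed finite constants. Then $$X_n - \eta$$ equals 0 with probability $$1 - \frac{1}{n}$$ and $$\theta - \eta$$ with probability $$\frac{1}{n}$$, so:

$$E\left( \left\vert X_n-\eta \right\vert ^r\right) = 0 \cdot \left( 1 - \frac{1}{n} \right) + \left\vert \theta - \eta \right\vert ^r \cdot \frac{1}{n} = \frac{\left\vert \theta - \eta \right\vert ^r}{n} \longrightarrow 0$$

Under that assumption the sequence also converges to $$\eta$$ in r-th mean for every $$r \geq 1$$.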

Cramer-Wold Device
The Cramer-Wold device will allow us to extend our convergence techniques for random variables from scalars to vectors.

Definition
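A sequence of random vectors $$\mathbf{X}_n$$ converges in distribution to $$\mathbf{X}$$ if and only if every fixed linear combination of its components converges in distribution to the same combination of the components of $$\mathbf{X}$$: $$\mathbf{X}_n \overset{d}{\longrightarrow} \mathbf{X} \; \iff \; \boldsymbol{\lambda}^{\operatorname{T}}\mathbf{X}_n \overset{d}{\longrightarrow} \boldsymbol{\lambda}^{\operatorname{T}}\mathbf{X} \quad \forall \boldsymbol{\lambda} \ne \mathbf{0}$$.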
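The device reduces a vector problem to a family of scalar ones, and that can be explored by simulation. The sketch below is an illustration under assumed inputs, not a proof: it builds a scaled sample mean of the bivariate vector $$(U, U+V)$$ with $$U, V$$ independent and uniform on $$(-1, 1)$$, projects it on a few directions $$\boldsymbol{\lambda}$$, and compares each projection with a normal distribution of variance $$\boldsymbol{\lambda}^{\operatorname{T}} \Sigma \boldsymbol{\lambda}$$. The data-generating process, the directions, the sample sizes and the seed are all hypothetical choices.

```python
import math
import random
import statistics


def scaled_mean_vector(n, rng):
    """sqrt(n) times the sample mean of n iid draws of (U, U + V)."""
    s1 = s2 = 0.0
    for _ in range(n):
        u, v = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        s1 += u
        s2 += u + v
    return s1 / math.sqrt(n), s2 / math.sqrt(n)


def projection_summary(lam, n=50, reps=2000, seed=0):
    """Empirical variance, target variance and 95% coverage of lam' Z_n."""
    rng = random.Random(seed)
    a, b = lam
    # lam' Sigma lam for Sigma = [[1/3, 1/3], [1/3, 2/3]], the covariance of (U, U + V).
    target_var = (a * a + 2 * a * b + 2 * b * b) / 3.0
    projections = []
    for _ in range(reps):
        z1, z2 = scaled_mean_vector(n, rng)
        projections.append(a * z1 + b * z2)
    sd = math.sqrt(target_var)
    inside = sum(abs(p) <= 1.96 * sd for p in projections) / reps
    return statistics.pvariance(projections), target_var, inside


for lam in ((1.0, 0.0), (0.0, 1.0), (1.0, -1.0)):
    print(lam, projection_summary(lam))
```

For each direction the empirical variance should be close to the target, and roughly 95% of the projections should fall within 1.96 standard deviations, as a normal limit would suggest.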

Central Limit Theorem
Let $$X_1, X_2, X_3, \ldots$$ be a sequence of random variables which are defined on the same probability space, share the same probability distribution $$D$$ and are independent. Assume that both the expected value $$\mu$$ and the standard deviation $$\sigma$$ of $$D$$ exist and are finite.

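Consider the sum $$S_n = X_1 + \cdots + X_n$$. Then the expected value of $$S_n$$ is $$n\mu$$ and its standard deviation is $$\sigma \sqrt{n}$$. Furthermore, informally speaking, the distribution of $$S_n$$ approaches the normal distribution $$N(n\mu, \sigma^2 n)$$ as $$n$$ approaches $$\infty$$. Formally, provided $$\sigma > 0$$, the standardized sum converges in distribution: $$\frac{S_n - n\mu}{\sigma \sqrt{n}} \overset{d}{\longrightarrow} N(0,1)$$.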
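The mean, the standard deviation and the normal approximation of $$S_n$$ can all be checked by simulation. The sketch below assumes, purely for illustration, that $$D$$ is the uniform distribution on $$(0,1)$$, so $$\mu = \frac{1}{2}$$ and $$\sigma = \sqrt{1/12}$$; the values of `n`, `reps` and the seed are also arbitrary choices.

```python
import math
import random
import statistics


def simulate_sums(n, reps, rng):
    """reps independent copies of S_n = X_1 + ... + X_n with X_i ~ Uniform(0, 1)."""
    return [sum(rng.random() for _ in range(n)) for _ in range(reps)]


n, reps = 30, 5000
mu, sigma = 0.5, math.sqrt(1.0 / 12.0)  # mean and standard deviation of Uniform(0, 1)
rng = random.Random(0)
sums = simulate_sums(n, reps, rng)

sample_mean = statistics.mean(sums)
sample_sd = statistics.pstdev(sums)

# Standardize each sum and measure how often it lands within 1.96 of zero.
standardized = [(s - n * mu) / (sigma * math.sqrt(n)) for s in sums]
coverage = sum(abs(z) <= 1.96 for z in standardized) / reps

print(sample_mean, n * mu)                # simulated vs theoretical mean
print(sample_sd, sigma * math.sqrt(n))    # simulated vs theoretical standard deviation
print(coverage)                           # near 0.95 if the normal approximation is good
```

Replacing the uniform draws with a skewed distribution is a useful variation: the mean and standard deviation still match $$n\mu$$ and $$\sigma \sqrt{n}$$, but a larger $$n$$ is typically needed before the coverage settles near 0.95.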