Kolmogorov's inequality


In probability theory, Kolmogorov's inequality is a so-called "maximal inequality" that bounds the probability that the maximum of the partial sums of a finite collection of independent random variables exceeds some specified value.

Statement of the inequality


Let $X_1, \dots, X_n \colon \Omega \to \mathbf{R}$ be independent random variables defined on a common probability space $(\Omega, \mathcal{F}, \Pr)$, with expected value $\mathrm{E}[X_k] = 0$ and variance $\operatorname{Var}[X_k] < +\infty$ for $k = 1, \dots, n$. Then, for each $\lambda > 0$,

$$
\Pr\left(\max_{1\le k\le n} |S_k| \ge \lambda\right) \le \frac{1}{\lambda^2}\operatorname{Var}[S_n] \equiv \frac{1}{\lambda^2}\sum_{k=1}^{n}\operatorname{Var}[X_k] = \frac{1}{\lambda^2}\sum_{k=1}^{n}\mathrm{E}[X_k^2],
$$

where $S_k = X_1 + \cdots + X_k$.

The usefulness of this result is that it bounds the worst-case deviation of a random walk at any intermediate time in terms of its variance at the end of the time interval.
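
As an illustration (not part of the original article), the following Python sketch estimates both sides of the inequality by Monte Carlo simulation for centered uniform increments; the function and parameter names are chosen for this example only.

```python
import numpy as np

def kolmogorov_bound_demo(n=50, lam=8.0, trials=200_000, seed=0):
    """Empirically compare Pr(max_k |S_k| >= lam) with Var[S_n] / lam^2."""
    rng = np.random.default_rng(seed)
    # Independent, mean-zero increments: uniform on [-1, 1], so Var[X_k] = 1/3.
    x = rng.uniform(-1.0, 1.0, size=(trials, n))
    s = np.cumsum(x, axis=1)                          # partial sums S_1, ..., S_n
    lhs = np.mean(np.max(np.abs(s), axis=1) >= lam)   # Pr(max_k |S_k| >= lam)
    rhs = (n * (1.0 / 3.0)) / lam**2                  # Var[S_n] / lam^2
    return lhs, rhs

lhs, rhs = kolmogorov_bound_demo()
print(f"estimated Pr(max |S_k| >= lam) = {lhs:.4f} <= bound {rhs:.4f}")
```

With these (arbitrary) parameters the empirical probability comes out well below the Chebyshev-type bound, as the inequality guarantees.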

Proof

The following argument employs discrete martingales. As argued in the discussion of Doob's martingale inequality, the sequence $S_1, S_2, \dots, S_n$ is a martingale. Define $(Z_i)_{i=0}^{n}$ as follows. Let $Z_0 = 0$, and

$$
Z_{i+1} =
\begin{cases}
S_{i+1} & \text{if } \displaystyle\max_{1\le j\le i} |S_j| < \lambda \\
Z_i & \text{otherwise}
\end{cases}
$$

for all $i$. Then $(Z_i)_{i=0}^{n}$ is also a martingale: it is the sequence $(S_i)$ stopped at the first index at which a partial sum reaches $\lambda$ in absolute value, and a stopped martingale is again a martingale.
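
To make the construction concrete, here is a small sketch (illustrative only; the helper name `stopped_sums` is introduced here, not taken from the article) that builds $(Z_i)$ from one path of partial sums and checks that $|Z_n| \ge \lambda$ exactly when $\max_k |S_k| \ge \lambda$.

```python
import numpy as np

def stopped_sums(s, lam):
    """Build Z_1, ..., Z_n from partial sums S_1, ..., S_n.

    Z_{i+1} = S_{i+1} while max_{j <= i} |S_j| < lam; afterwards Z stays frozen.
    """
    z = np.empty_like(s)
    z_prev = 0.0                      # Z_0 = 0
    running_max = 0.0                 # max_{j <= i} |S_j| so far
    for i, s_i in enumerate(s):
        z[i] = s_i if running_max < lam else z_prev
        z_prev = z[i]
        running_max = max(running_max, abs(s_i))
    return z

rng = np.random.default_rng(1)
s = np.cumsum(rng.uniform(-1.0, 1.0, size=50))   # one random-walk path
z = stopped_sums(s, lam=4.0)
# The stopping construction makes these two events coincide path by path.
assert (abs(z[-1]) >= 4.0) == (np.max(np.abs(s)) >= 4.0)
```

This pathwise equivalence of the two events is what is used in the final chain of inequalities below.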

For any martingale $(M_i)$ with $M_0 = 0$, we have

$$
\begin{aligned}
\sum_{i=1}^{n} \mathrm{E}\left[(M_i - M_{i-1})^2\right]
&= \sum_{i=1}^{n} \mathrm{E}\left[M_i^2 - 2 M_i M_{i-1} + M_{i-1}^2\right] \\
&= \sum_{i=1}^{n} \mathrm{E}\left[M_i^2 - 2\left(M_{i-1} + M_i - M_{i-1}\right) M_{i-1} + M_{i-1}^2\right] \\
&= \sum_{i=1}^{n} \mathrm{E}\left[M_i^2 - M_{i-1}^2\right] - 2\,\mathrm{E}\left[M_{i-1}(M_i - M_{i-1})\right] \\
&= \mathrm{E}[M_n^2] - \mathrm{E}[M_0^2] = \mathrm{E}[M_n^2].
\end{aligned}
$$
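
The last equality uses two facts left implicit in the display: the cross terms vanish by the martingale property, and the remaining sum telescopes. Writing $\mathcal{F}_{i-1}$ for the σ-algebra generated by $M_0, \dots, M_{i-1}$ (notation introduced here for convenience), the cross term is

$$
\mathrm{E}\left[M_{i-1}(M_i - M_{i-1})\right]
= \mathrm{E}\left[\,\mathrm{E}\left[M_{i-1}(M_i - M_{i-1}) \mid \mathcal{F}_{i-1}\right]\right]
= \mathrm{E}\left[M_{i-1}\left(\mathrm{E}[M_i \mid \mathcal{F}_{i-1}] - M_{i-1}\right)\right]
= 0,
$$

since $\mathrm{E}[M_i \mid \mathcal{F}_{i-1}] = M_{i-1}$, while $\sum_{i=1}^{n} \mathrm{E}\left[M_i^2 - M_{i-1}^2\right] = \mathrm{E}[M_n^2] - \mathrm{E}[M_0^2]$ by telescoping.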

Applying this result to the martingales $(Z_i)_{i=0}^{n}$ and $(S_i)_{i=0}^{n}$ (with $S_0 = 0$), we have

$$
\begin{aligned}
\Pr\left(\max_{1\le i\le n} |S_i| \ge \lambda\right)
&= \Pr\left[|Z_n| \ge \lambda\right] \\
&\le \frac{1}{\lambda^2}\mathrm{E}[Z_n^2]
= \frac{1}{\lambda^2}\sum_{i=1}^{n}\mathrm{E}\left[(Z_i - Z_{i-1})^2\right] \\
&\le \frac{1}{\lambda^2}\sum_{i=1}^{n}\mathrm{E}\left[(S_i - S_{i-1})^2\right]
= \frac{1}{\lambda^2}\mathrm{E}[S_n^2]
= \frac{1}{\lambda^2}\operatorname{Var}[S_n],
\end{aligned}
$$

where the first equality holds because $|Z_n| \ge \lambda$ exactly when some partial sum reaches $\lambda$ in absolute value, the first inequality follows from Chebyshev's inequality, and the second inequality holds because each increment $Z_i - Z_{i-1}$ equals either $X_i$ or $0$.

This inequality was generalized by Hájek and Rényi in 1955.

This article incorporates material from Kolmogorov's inequality on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.