Comentarii Adauga Comentariu

Legea numerelor mari

Exemplificarea a manifestării Legii Numerelor Mari: Grafic care ne arată cum tind aruncările stemă-ban să se egalizeze pe măsură ce executăm tot mai multe aruncări (ex. pe 10 seturi a 100 de aruncări).

Let X be a real-valued random variable, and let X_1, X_2, X_3, ... be an infinite sequence of independent and identically distributed copies of X. Let \overline{X}_n := \frac{1}{n}(X_1 + \ldots + X_n) be the empirical averages of this sequence. A fundamental theorem in probability theory is the law of large numbers, which comes in both a weak and a strong form:

Weak law of large numbers. Suppose that the first moment {\Bbb E} |X| of X is finite. Then \overline{X}_n converges in probability to {\Bbb E} X, thus \lim_{n \to \infty} {\Bbb P}( |\overline{X}_n - {\Bbb E} X| \geq \varepsilon ) = 0 for every \varepsilon > 0.

Strong law of large numbers. Suppose that the first moment {\Bbb E} |X| of X is finite. Then \overline{X}_n converges almost surely to {\Bbb E} X, thus {\Bbb P}( \lim_{n \to \infty} \overline{X}_n = {\Bbb E} X ) = 1.

[The concepts of convergence in probability and almost sure convergence in probability theory are specialisations of the concepts of convergence in measure and pointwise convergence almost everywhere in measure theory.]

(If one strengthens the first moment assumption to that of finiteness of the second moment {\Bbb E}|X|^2, then we of course have a more precise statement than the (weak) law of large numbers, namely the central limit theorem, but I will not discuss that theorem here.  With even more hypotheses on X, one similarly has more precise versions of the strong law of large numbers, such as the Chernoff inequality, which I will again not discuss here.)

The weak law is easy to prove, but the strong law (which of course implies the weak law, by Egoroff’s theorem) is more subtle, and in fact the proof of this law (assuming just finiteness of the first moment) usually only appears in advanced graduate texts. So I thought I would present a proof here of both laws, which proceeds by the standard techniques of the moment method and truncation. The emphasis in this exposition will be on motivation and methods rather than brevity and strength of results; there do exist proofs of the strong law in the literature that have been compressed down to the size of one page or less, but this is not my goal here.

— The moment method —

The moment method seeks to control the tail probabilities of a random variable (i.e. the probability that it fluctuates far from its mean) by means of moments, and in particular the zeroth, first or second moment. The reason that this method is so effective is because the first few moments can often be computed rather precisely. The first moment method usually employs Markov’s inequality

\displaystyle {\Bbb P}( |X| \geq \lambda ) \leq \frac{1}{\lambda} {\Bbb E} |X| (1)

(which follows by taking expectations of the pointwise inequality \lambda I(|X| \geq \lambda) \leq |X|), whereas the second moment method employs some version of Chebyshev’s inequality, such as

\displaystyle {\Bbb P}( |X| \geq \lambda ) \leq \frac{1}{\lambda^2} {\Bbb E} |X|^2 (2)

(note that (2) is just (1) applied to the random variable |X|^2 and to the threshold \lambda^2).

Generally speaking, to compute the first moment one usually employs linearity of expectation

\displaystyle {\Bbb E} X_1 + \ldots + X_n = {\Bbb E} X_1 + \ldots + {\Bbb E} X_n,

whereas to compute the second moment one also needs to understand covariances (which are particularly simple if one assumes pairwise independence), thanks to identities such as

\displaystyle {\Bbb E} (X_1 + \ldots + X_n)^2 = {\Bbb E} X_1^2 + \ldots + {\Bbb E} X_n^2 + 2 \sum_{1 \leq i < j \leq n} X_i X_j

or the normalised variant

\displaystyle {\bf Var}(X_1+\ldots+X_n) = {\bf Var}(X_1) + \ldots + {\bf Var}(X_n)

\displaystyle + 2 \sum_{1 \leq i < j \leq n} {\bf Cov}(X_i,X_j). (3)

Higher moments can in principle give more precise information, but often require stronger assumptions on the objects being studied, such as joint independence.

Here is a basic application of the first moment method:

Borel-Cantelli lemma. Let E_1, E_2, E_3, \ldots be a sequence of events such that \sum_{n=1}^\infty {\Bbb P}(E_n) is finite. Then almost surely, only finitely many of the events E_n are true.

Proof. Let I(E_n) denote the indicator function of the event E_n. Our task is to show that \sum_{n=1}^\infty I(E_n) is almost surely finite. But by linearity of expectation, the expectation of this random variable is \sum_{n=1}^\infty {\Bbb P}(E_n), which is finite by hypothesis. By Markov’s inequality (1) we conclude that

\displaystyle {\Bbb P}( \sum_{n=1}^\infty I(E_n) \geq \lambda ) \leq \frac{1}{\lambda} \sum_{n=1}^\infty {\Bbb P}(E_n).

Letting \lambda \to \infty we obtain the claim. \Box

Returning to the law of large numbers, the first moment method gives the following tail bound:

Lemma 1. (First moment tail bound) If {\Bbb E}|X| is finite, then

\displaystyle {\Bbb P}( |\overline{X}_n| \geq \lambda ) \leq \frac{{\Bbb E}|X|}{\lambda}.

Proof. By the triangle inequality, |\overline{X}_n| \leq \overline{|X|}_n. By linearity of expectation, the expectation of \overline{|X|}_n is {\Bbb E}|X|. The claim now follows from Markov’s inequality. \Box

Lemma 1 is not strong enough by itself to prove the law of large numbers in either weak or strong form – in particular, it does not show any improvement as n gets large – but it will be useful to handle one of the error terms in those proofs.

We can get stronger bounds than Lemma 1 – in particular, bounds which improve with n – at the expense of stronger assumptions on X.

Lemma 2. (Second moment tail bound) If {\Bbb E}|X|^2 is finite, then

\displaystyle {\Bbb P}( |\overline{X}_n - {\Bbb E}(X)| \geq \lambda ) \leq \frac{ {\Bbb E}|X - {\Bbb E}(X)|^2 }{n \lambda^2}.

Proof. A standard computation, exploiting (3) and the pairwise independence of the X_i, shows that the variance {\Bbb E} |\overline{X}_n - {\Bbb E}(X)|^2 of the empirical averages \overline{X}_n is equal to \frac{1}{n} times the variance {\Bbb E} |X - {\Bbb E}(X)|^2 of the original variable X. The claim now follows from Chebyshev’s inequality (2). \Box

In the opposite direction, there is the zeroth moment method, more commonly known as the union bound

\displaystyle {\Bbb P}( E_1 \vee \ldots \vee E_n ) \leq \sum_{j=1}^n {\Bbb P}(E_j)

or equivalently (to explain the terminology “zeroth moment”)

\displaystyle {\Bbb E} (X_1 + \ldots + X_n)^0 \leq {\Bbb E} X_1^0 + \ldots + X_n^0

for any non-negative random variables X_1,\ldots,X_n \geq 0. Applying this to the empirical means, we obtain the zeroth moment tail estimate

{\Bbb P} (\overline{X}_n \neq 0) \leq n {\Bbb P}(X \neq 0). (4)

Just as the second moment bound (Lemma 2) is only useful when one has good control on the second moment (or variance) of X, the zeroth moment tail estimate (3) is only useful when we have good control on the zeroth moment {\Bbb E} |X|^0 = {\Bbb P}(X \neq 0), i.e. when X is mostly zero.

— Truncation —

The second moment tail bound (Lemma 2) already gives the weak law of large numbers in the case when X has finite second moment (or equivalently, finite variance). In general, if all one knows about X is that it has finite first moment, then we cannot conclude that X has finite second moment. However, we can perform a truncation

\displaystyle X = X_{\leq N} + X_{>N} (5)

of X at any desired threshold N, where X_{\leq N} := X I(|X| \leq N) and X_{>N} := X I(|X| > N). The first term X_{\leq N} has finite second moment; indeed we clearly have

\displaystyle {\Bbb E} |X_{\leq N}|^2 \leq N {\Bbb E} |X|

and hence also we have finite variance

\displaystyle {\Bbb E} |X_{\leq N} - {\Bbb E} X_{\leq N}|^2 \leq N {\Bbb E} |X|. (6)

The second term X_{>N} may have infinite second moment, but its first moment is well controlled. Indeed, by the monotone convergence theorem, we have

\displaystyle {\Bbb E} |X_{>N}| \to 0 \hbox{ as } N \to \infty. (7)

By the triangle inequality, we conclude that the first term X_{\leq N} has expectation close to {\Bbb E} X:

\displaystyle {\Bbb E} X_{\leq N} \to {\Bbb E}(X) \hbox{ as } N \to \infty. (8)

These are all the tools we need to prove the weak law of large numbers:

Proof of weak law. Let \varepsilon > 0. It suffices to show that whenever n is sufficiently large depending on \varepsilon, that \overline{X}_n = {\Bbb E} X + O(\varepsilon) with probability 1-O(\varepsilon).

From (7), (8), we can find a threshold N (depending on \varepsilon) such that {\Bbb E} |X_{\geq N}| = O(\varepsilon^2) and {\Bbb E} X_{<N} = {\Bbb E} X + O(\varepsilon). Now we use (5) to split

\displaystyle \overline{X}_n = (\overline{X_{\geq N}})_n +(\overline{X_{< N}})_n.

From the first moment tail bound (Lemma 1), we know that (\overline{X_{\geq N}})_n = O(\varepsilon) with probability 1 - O(\varepsilon). From the second moment tail bound (Lemma 2) and (6), we know that (\overline{X_{< N}})_n = {\Bbb E} X_{<N} + O(\varepsilon) = {\Bbb E} X + O(\varepsilon) with probability 1-O(\varepsilon) if n is sufficiently large depending on N and \varepsilon. The claim follows. \Box

— The strong law —

The strong law can be proven by pushing the above methods a bit further, and using a few more tricks.

The first trick is to observe that to prove the strong law, it suffices to do so for non-negative random variables X \geq 0. Indeed, this follows immediately from the simple fact that any random variable X with finite first moment can be expressed as the difference of two non-negative random variables \max(X,0), \max(-X,0) of finite first moment.

Once X is non-negative, we see that the empirical averages \overline{X}_n cannot decrease too quickly in n. In particular we observe that

\displaystyle \overline{X}_m \leq (1+O(\varepsilon)) \overline{X}_n whenever (1-\varepsilon) n \leq m \leq n. (9)

Because of this quasimonotonicity, we can sparsify the set of n for which we need to prove the strong law. More precisely, it suffices to show

Strong law of large numbers, reduced version. Let X be a non-negative random variable with {\Bbb E} X < \infty, and let 1 \leq n_1\leq n_2\leq n_3\leq\ldots be a sequence of integers which is lacunary in the sense that n_{j+1}/n_j > c for some c>1 and all sufficiently large j. Then \overline{X}_{n_j} converges almost surely to {\Bbb E} X.

Indeed, if we could prove the reduced version, then on applying that version to the lacunary sequence n_j := \lfloor (1 + \varepsilon)^j\rfloor and using (9) we would see that almost surely the empirical means \overline{X}_n cannot deviate by more than a multiplicative error of 1+O(\varepsilon) from the mean {\Bbb E} X. Setting \varepsilon := 1/m for m=1,2,3,\ldots (and using the fact that a countable intersection of almost sure events remains almost sure) we obtain the full strong law.

[This sparsification trick is philosophically related to the dyadic pigeonhole principle philosophy; see an old short story of myself on this latter topic. One could easily sparsify further, so that the lacunarity constant c is large instead of small, but this turns out not to help us too much in what follows.]

Now that we have sparsified the sequence, it becomes economical to apply the Borel-Cantelli lemma. Indeed, by many applications of that lemma we see that it suffices to show that

\displaystyle \sum_{j=1}^\infty {\Bbb P}( \overline{X}_{n_j} \neq {\Bbb E}(X) + O( \varepsilon ) ) < \infty (10)

for non-negative X of finite first moment, any lacunary sequence 1 \leq n_1 \leq n_2 \leq \ldots and any \varepsilon > 0. [This is a slight abuse of the O() notation, but it should be clear what is meant by this.]

[If we did not first sparsify the sequence, the Borel-Cantelli lemma would have been too expensive to apply; see Remark 2 below. Generally speaking, Borel-Cantelli is only worth applying when one expects the events E_n to be fairly “disjoint” or “independent” of each other; in the non-lacunary case, the events E_n change very slowly in n, which makes the lemma very inefficient. We will not see how lacunarity is exploited until the punchline at the very end of the proof, but certainly there is no harm in taking advantage of this “free” reduction to the lacunary case now, even if it is not immediately clear how it will be exploited.]

At this point we go back and apply the methods that already worked to give the weak law. Namely, to estimate each of the tail probabilities {\Bbb P}( \overline{X}_{n_j} \neq {\Bbb E}(X) + O(\varepsilon) ), we perform a truncation (5) at some threshold N_j. It is not immediately obvious what truncation to perform, so we adopt the usual strategy of leaving N_j unspecified for now and optimising in this parameter later.

We should at least pick N_j large enough so that {\Bbb E} X_{< N_j} = {\Bbb E} X + O(\varepsilon). From the second moment tail estimate (Lemma 2) we conclude that (\overline{X_{< N_j}})_{n_j} is also equal to {\Bbb E} X + O( \varepsilon ) with probability 1-O( \frac{1}{\varepsilon n_j} {\Bbb E} |X_{\leq N_j}|^2 ). One could attempt to simplify this expression using (6), but this turns out to be a little wasteful, so let us hold off on that for now. However, (6) does strongly suggest that we want to take N_j to be something like n_j, which is worth keeping in mind in what follows.

Now we look at the contribution of X_{\geq N_j}. One could use the first moment tail estimate (Lemma 1), but it turns out that the first moment {\Bbb E} X_{> N_j} decays too slowly in j to be of much use (recall that we are expecting N_j to be like the lacunary sequence n_j); the root problem here is that the decay (7) coming from the monotone convergence theorem is ineffective (one could effectivise this using the finite convergence principle, but this turns out to give very poor results here).

But there is one last card to play, which is the zeroth moment method tail estimate (4). As mentioned earlier, this bound is lousy in general – but is very good when X is mostly zero, which is precisely the situation with X_{>N_j}. and in particular we see that (\overline{X_{>N_j}})_{n_j} is zero with probability 1 - O( n_j {\Bbb P}(X > N_j) ).

Putting this all together, we see that

\displaystyle {\Bbb P}( \overline{X}_{n_j} \neq {\Bbb E}(X) + O( \varepsilon ) ) \leq O( \frac{1}{\varepsilon n_j} {\Bbb E} |X_{\leq N_j}|^2 ) + O( n_j {\Bbb P}(X > N_j) ).

Summing this in j, we see that we will be done as soon as we figure out how to choose N_j so that

\displaystyle \sum_{j=1}^\infty \frac{1}{n_j} {\Bbb E} |X_{\leq N_j}|^2 (11)


\displaystyle \sum_{j=1}^\infty n_j {\Bbb P}(X > N_j) (12)

are both finite. (As usual, we have a tradeoff: making the N_j larger makes (12) easier to establish at the expense of (11), and vice versa when making N_j smaller.)

Based on the discussion earlier, it is natural to try setting N_j := n_j. Happily, this choice works cleanly; the lacunary nature of n_j ensures (basically from the geometric series formula) that we have the pointwise estimates

\displaystyle \sum_{j=1}^\infty \frac{1}{n_j} |X_{\leq n_j}|^2 = O( X )


\displaystyle \sum_{j=1}^\infty n_j I( X \geq n_j ) = O( X )

(where the implied constant here depends on the sequence n_1, n_2, \ldots, and in particular on the lacunarity constant c). The claims (10), (11) then follow from one last application of linearity of expectation, giving the strong law of large numbers.

Remark 1. The above proof in fact shows that the strong law of large numbers holds even if one only assumes pairwise independence of the X_n, rather than joint independence. \diamond

Remark 2. It is essential that the random variables X_1,X_2,\ldots are “recycled” from one empirical average \overline{X}_n to the next, in order to get the crucial quasimonotonicity property (9). If instead we took completely independent averages \overline{X}_n = \frac{1}{n} (X_{n,1} + \ldots + X_{n,n} ), where the X_{i,j} are all iid, then the strong law of large numbers in fact breaks down with just a first moment assumption. (For a counterexample, consider a random variable X which equals 2^m / m^2 with probability 2^{-m} for m=1,2,3,\ldots; this random variable (barely) has finite first moment, but for n \sim 2^m/m^2, we see that \overline{X}_n deviates by at least absolute constant from its mean with probability \gg 1/m^2. As the empirical means \overline{X}_n for n \sim 2^m/m^2 are now jointly independent, the probability that one of them deviates significantly is now extremely close to 1 (super-exponentially close in m, in fact), leading to the total failure of the strong law in this setting.) Of course, if one restricts attention to a lacunary sequence of n then the above proof goes through in the independent case (since the Borel-Cantelli lemma is insensitive to this independence). By exploiting the joint independence further (e.g. by using Chernoff’s inequality) one can also get the strong law for independent empirical means for the full sequence n under second moment bounds. \diamond

Remark 3. From the perspective of interpolation theory, one can view the above argument as an interpolation argument, establishing an L^1 estimate (10) by interpolating between an L^2 estimate (Lemma 2) and the L^0 estimate (4). \diamond

Remark 4. By viewing the sequence X_1,X_2,\ldots as a stationary process, and thus as a special case of a measure-preserving system one can view the weak and strong law of large numbers as special cases of the mean and pointwise ergodic theorems respectively (see Exercise 9 from 254A Lecture 8 and Theorem 2 from 254A Lecture 9).  \diamond

Sursa terrytao.wordpress.com


Linkul direct catre Petitie

CEREM NATIONALIZAREA TUTUROR RESURSELOR NATURALE ALE ROMANIEI ! - Initiativa Legislativa care are nevoie de 500.000 de semnaturi - Semneaza si tu !


Adauga Comentariu

Citiți și cele mai căutate articole de pe Fluierul:

Florin Cîțu acuză: Vizita lui Dăncilă în SUA e privată. Asta înseamnă deturnare de fonduri și închisoare

Viktor Orban: În Ungaria a fost construit statul democrat-creștin

Iohannis, la Craiova: M-am așteptat să găsesc pe undeva o bucată de drum rapid sau autostradă

50 de ani de minciuni climatologice progresiste

Franța: Poliția a folosit grenade cu gaze lacrimogene contra 'vestelor galbene' și manifestanților ecologiști

O nouă retragere din cursa pentru Cotroceni: "Nu vreau să intru în competiție prin fraudă"

Reportaj realizat de istoricul Marius Oprea: Sfinții de pe strada Crizantemelor/ Arheologia crimelor comunismului la Tîrgu Ocna

Ficțiune devenită realitate: Un actor și-a tăiat gâtul în timpul unei piese de teatru, după ce cuțitul din recuzită s-a dovedit a fi real

Conservatorii continuă să conducă în sondajele din Marea Britanie

Caz tragic în Prahova: Un bărbat a murit în urma unui incendiu izbucnit de la o țigară uitată aprinsă

Gest de romantism cu final tragic: Un bărbat s-a înecat în timp ce încerca să își ceară iubita în căsătorie sub apă | VIDEO

Cum s-a trăit în România în anii '70- '80? UNII VĂ MINT spunându-vă că s-a trăit permanent prost între anii 1965 - 1989.

Tenis: Federer și Nadal au adus Europa în avantaj în Laver Cup


Tensiunile din Orientul Mijlociu. Arabia Saudită va considera că este un "act de război" dacă Iranul va fi găsit vinovat

Perechea Simona Halep/ Raluca Olaru s-a calificat în turul 2 la WTA Wuhan

Avionul Egyptair prăbuşit: Analiza uneia din cutiile negre arată că s-a pronunţat cuvântul "foc"

Detalii despre prinderea evadatului de la Penitenciarul Focșani: Bărbatul a atacat o femeie în propria ei casă și i-a furat banii

Siria: Autoritățile anunță doborârea unei drone înarmate în sudul țării

INS: 98,2% dintre salariați lucrează în baza unui contract de muncă pe perioadă nedeterminată

Incident produs în timpul nopții în Drobeta Turnu-Severin: Anvelope tăiate, geamuri și oglinzi distruse la trei mașini

Un vagon al unui tren de marfă, încărcat cu combustibil, a deraiat în județul Constanța


Yemen: Riadul reacționează cu prudență la propunerea rebelilor houthi de a înceta atacurile

Dăncilă: Nu suntem un guvern eșuat, suntem un guvern eficient

Cătălin Ivan și-a depus la BEC candidatura pentru prezidențiale. Câte semnături a strâns candidatul Alternativei pentru Demnitate Națională | VIDEO

Barcelona a pierdut cu 2-0 meciul cu Granada, din campionatul Spaniei

Incendiu devastator: Un depozit cu 2.000 de tone de cereale a ars în totalitate/ Pompierii intervin pentru stingerea ultimelor focare | FOTO, VIDEO

Jigniri fără precedent după meciul Atletico Madrid - Juventus. Ronaldo i-a atacat pe suporterii lui Atletico: Prea mulți oameni proști

Adevărul despre vitejia românilor. Ce spuneau străinii de acum 500 de ani despre curajul strămoșilor noștri pe câmpul de luptă

Încă un candidat eliminat din cursa pentru prezidențiale. Pleșoianu nu a obținut semnăturile suficiente: O să fiu de 100 de ori mai vocal

Gelu Voican Voiculescu, agresat de un vecin în propria locuință. Motivul ar fi fost gălăgia din apartamentul fostului vicepremier

Urs agresiv într-o localitate din Bistrița-Năsăud: Populația a fost avertizată prin sistemul Ro-Alert


Ciclism: Eduard Grosu a încheiat Turul Slovaciei pe locul 20

Handbal feminin: SCM Craiova și Măgura Cisnădie, victorioase în deplasare, în Liga Națională

O mică victorie a activiștilor: Aproape 90 de mari companii ale lumii se angajează să-și reducă emisiile de gaze cu efect de seră

Atac armat simulat în ambasada Israelului: Bucureștenii, avertizați prin Ro-Alert/ Ce urmăresc specialiștii SRI | VIDEO

Arad: Un depozit în care se aflau 2.000 de tone de cereale a ars complet într-un incendiu

Razie în una dintre cele mai mari piețe din Capitală: ANPC efectueaza acum un control în Piața Obor | FOTO, VIDEO

Loteria Română: Reporturi de peste 3,4 milioane de euro la Joker și de peste 1,6 milioane de euro la Loto 6/49

Ziua Mărțișorului ar putea deveni zi de sărbătoare națională. Proiectul de lege, depus la Senat

Noi detalii despre starea de sănătate a lui Gabi Balint. Ilie Dumitrescu: Mi-a dat mesaj/ Ce i-a scris fostul internațional, după infarctul suferit

Accident teribil în Buzău: Două persoane au murit, iar alte două au fost rănite

Lupte: România a încheiat Mondialele din Kazahstan cu o singură medalie

Un cutremur de 5,6 grade pe scara Richter a avut loc sâmbătă în Albania

Tenis: Nicholas David Ionel va juca finala turneului ITF de la Târgu Mureș

Tenis: Christopher O'Connell și Danilo Petrovic, în finala turneului challenger Sibiu Open

Recep Tayyip Erdogan: Turcia este pregătită să intervină militar la granița cu Siria

Tenis: Ana Bogdan și Jaqueline Cristian, în ultimul tur al calificărilor la Tașkent (WTA)

Dâmbovița: Fata de 11 ani dispărută a fost găsită decedată

Un exercițiu antitero organizat de SRI se desfășoară la ambasada Israelului

Camera de Comerț Americană în România se distanțează de vizita lui Dăncilă în SUA: Nu suntem implicați în organizarea evenimentului

Ea este Ivona, fiica lui "Garcea". La 18 ani este bombă SEXY FOTO

ANPIS: Peste 1,1 miliarde lei, plătite în august 2019 pentru principalele beneficii de asistență socială

Palatul Parlamentului, evaluat de Guvern: Cât valorează cea mai mare clădire administrativă din lume

Tenis: Japoneza Naomi Osaka a cucerit trofeul la Osaka (WTA)

Alexandru Cumpănașu, Ramona Bruynseels și Cătălin Ivan își depun duminică la BEC candidaturile pentru alegerile prezidențiale

Tenis: Andreea Mitu, campioană la dublu și calificată în finala de simplu la Arad (ITF)

Hochei: SC Miercurea Ciuc și ACSH Gheorgheni, victorioase în Erste Liga

Pag.1 Pag.2 Pag.3 Pag.4 Pag.5 Pag.6 Pag.7
Pag.8 Pag.9 Pag.10 Pag.11

Nr. de articole la aceasta sectiune: 647, afisate in 11 pagini.

ieri 04:57 CITATUL ZILEI