Asymptotics of certain conditionally identically distributed sequences

Limit theorems for predictive sequences of random variables

2002

A new type of stochastic dependence for a sequence of random variables is introduced and studied. Precisely, $(X_n)_{n\ge 0}$ is said to be predictive, with respect to a filtration $(\mathcal G_n)_{n\ge 0}$ such that $\mathcal G_n \supset \sigma(X_0,\dots,X_n)$, if $X_0$ is distributed as $X_1$ and, for each $n \ge 0$, $(X_k)_{k>n}$ is identically distributed given the past $\mathcal G_n$. In case $\mathcal G_n = \sigma(X_0,\dots,X_n)$, a result of Kallenberg implies that $(X_n)_{n\ge 0}$ is exchangeable if and only if it is stationary and predictive. After giving some natural examples of nonexchangeable predictive sequences, it is shown that $(X_n)_{n\ge 0}$ is exchangeable if and only if $(X_{\tau(n)})_{n\ge 0}$ is predictive for any finite permutation $\tau$ of $\mathbb N$, and that the distribution of a predictive sequence agrees with an exchangeable law on a certain sub-$\sigma$-field. Moreover, $(1/n)\sum_{k=0}^{n-1} f(X_k)$ converges a.s. and in $L^1$ whenever $(X_n)_{n\ge 0}$ is predictive and $f$ is a real measurable function such that $E[|f(X_0)|] < \infty$. As to the CLT, three types of random centering are considered. One such centering, significant in Bayesian prediction and discrete time filtering, is $E[f(X_{n+1}) \mid \mathcal G_n]$. For each centering, convergence in distribution of the corresponding empirical process is analyzed under uniform distance.
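To see the strong law in this abstract at work, a two-colour Pólya urn is a standard example of a predictive (indeed exchangeable) sequence. The following Python sketch is our illustration, not code from the paper; the function and variable names are ours.

```python
# Pólya urn: draw a ball, put it back together with one more of the same
# colour; X_k = 1 if the k-th draw is white.  This sequence is exchangeable,
# hence predictive.  Illustrative sketch only, not code from the paper.
import numpy as np

rng = np.random.default_rng(0)

def polya_urn(n_steps, white=1, black=1):
    xs = np.empty(n_steps, dtype=int)
    for k in range(n_steps):
        x = int(rng.random() < white / (white + black))
        xs[k] = x
        white += x
        black += 1 - x
    return xs

xs = polya_urn(10_000)
# The running mean (1/n) * sum_{k<n} f(X_k), with f the identity, converges
# a.s. -- here to a random Beta(1,1)-distributed limit -- as the SLLN for
# predictive sequences asserts.
for n in (100, 1_000, 10_000):
    print(n, xs[:n].mean())
```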

Asymptotics of Predictive Distributions

Advances in intelligent systems and computing, 2016

Let $(X_n)$ be a sequence of random variables, adapted to a filtration $(\mathcal G_n)$, and let $\mu_n = (1/n)\sum_{i=1}^n \delta_{X_i}$ and $a_n(\cdot) = P(X_{n+1} \in \cdot \mid \mathcal G_n)$ be the empirical and the predictive measures. We focus on $\|\mu_n - a_n\| = \sup_{B \in \mathcal D} |\mu_n(B) - a_n(B)|$, where $\mathcal D$ is a class of measurable sets. Conditions for $\|\mu_n - a_n\| \to 0$, almost surely or in probability, are given. Also, to determine the rate of convergence, the asymptotic behavior of $r_n \|\mu_n - a_n\|$ is investigated for suitable constants $r_n$. Special attention is paid to $r_n = \sqrt n$. The sequence $(X_n)$ is exchangeable or, more generally, conditionally identically distributed.
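For a two-colour Pólya urn both measures in this abstract are explicit, so $\|\mu_n - a_n\|$ can be computed exactly. The sketch below is our illustration (the urn and all names are our choices), with $\mathcal D$ the subsets of $\{0,1\}$.

```python
# Two-colour Pólya urn: a_n({1}) = (1 + #ones)/(n + 2) is explicit, and with
# D = all subsets of {0,1} the sup defining ||mu_n - a_n|| is attained at {1}.
# Illustrative sketch only, not code from the paper.
import numpy as np

rng = np.random.default_rng(1)
white, black, ones = 1, 1, 0
for n in range(1, 10_001):
    x = int(rng.random() < white / (white + black))
    white += x
    black += 1 - x
    ones += x
    if n in (100, 1_000, 10_000):
        gap = abs(ones / n - white / (white + black))
        print(n, gap, np.sqrt(n) * gap)
# Here the gap is O(1/n), so sqrt(n) * ||mu_n - a_n|| -> 0: an instance of
# the faster-than-sqrt(n) rates the paper isolates.
```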

Rate of convergence of predictive distributions for dependent data

Bernoulli, 2009

This paper deals with empirical processes of the type $C_n(B) = \sqrt n\,\{\mu_n(B) - P(X_{n+1} \in B \mid X_1, \dots, X_n)\}$, where $(X_n)$ is a sequence of random variables and $\mu_n = (1/n)\sum_{i=1}^n \delta_{X_i}$ the empirical measure. Conditions for $\sup_B |C_n(B)|$ to converge stably (in particular, in distribution) are given, where $B$ ranges over a suitable class of measurable sets. These conditions apply when $(X_n)$ is exchangeable or, more generally, conditionally identically distributed (in the sense of Berti et al. [Ann. Probab. 32 (2004) 2029–2052]). Under such conditions, in some relevant situations, one obtains that $\sup_B |C_n(B)| \to 0$ in probability or even that $\sqrt n \sup_B |C_n(B)|$ converges a.s. Results of this type are useful in Bayesian statistics.
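In the i.i.d. special case the predictive distribution is the fixed marginal, so $\sup_B |C_n(B)|$ over half-lines reduces to the classical Kolmogorov statistic. The following sketch (ours, not the paper's) shows it converging in distribution to a nondegenerate limit rather than to zero.

```python
# I.i.d. Uniform(0,1) data: the predictive distribution equals the marginal,
# so the sup over B = (-inf, t] of |C_n(B)| is the Kolmogorov statistic
# sqrt(n) * D_n.  Illustrative sketch only.
import numpy as np

rng = np.random.default_rng(2)

def sup_Cn(n):
    x = np.sort(rng.random(n))               # true cdf is F(t) = t
    i = np.arange(1, n + 1)
    d_n = max(np.max(i / n - x), np.max(x - (i - 1) / n))
    return np.sqrt(n) * d_n

# The statistic fluctuates around the Kolmogorov law (median ~ 0.83):
# it converges in distribution, but not to zero.
print([round(sup_Cn(n), 3) for n in (100, 1_000, 10_000, 100_000)])
```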

A Central Limit Theorem for Predictive Distributions

Mathematics

Let $S$ be a Borel subset of a Polish space and $F$ the set of bounded Borel functions $f: S \to \mathbb R$. Let $a_n(\cdot) = P(X_{n+1} \in \cdot \mid X_1, \dots, X_n)$ be the $n$-th predictive distribution corresponding to a sequence $(X_n)$ of $S$-valued random variables. If $(X_n)$ is conditionally identically distributed, there is a random probability measure $\mu$ on $S$ such that $\int f\,da_n \xrightarrow{a.s.} \int f\,d\mu$ for all $f \in F$. Define $D_n(f) = d_n\bigl(\int f\,da_n - \int f\,d\mu\bigr)$ for all $f \in F$, where $d_n > 0$ is a constant. In this note, it is shown that, under some conditions on $(X_n)$ and with a suitable choice of $d_n$, the finite dimensional distributions of the process $D_n = \{D_n(f) : f \in F\}$ stably converge to a Gaussian kernel with a known covariance structure. In addition, $E\bigl[\varphi(D_n(f)) \mid X_1, \dots, X_n\bigr]$ converges in probability for all $f \in F$ and $\varphi \in C_b(\mathbb R)$.
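As a concrete instance (our example, worked out under the stated assumptions, not taken from the paper), the two-colour Pólya urn with $f$ the identity and $d_n = \sqrt n$ gives:

```latex
% Two-colour P\'olya urn, f = identity, d_n = \sqrt{n} (our example):
\[
  \textstyle\int f\,da_n \;=\; a_n(\{1\}) \;=\; \frac{1+\sum_{i\le n}X_i}{n+2}
  \;\xrightarrow{a.s.}\; \theta, \qquad \theta \sim \mathrm{Beta}(1,1),
  \qquad \mu = \theta\,\delta_1 + (1-\theta)\,\delta_0,
\]
\[
  D_n(f) \;=\; \sqrt{n}\Bigl(\textstyle\int f\,da_n - \int f\,d\mu\Bigr)
  \;\longrightarrow\; \mathcal N\bigl(0,\;\theta(1-\theta)\bigr)
  \quad\text{stably},
\]
% so the limit is a Gaussian kernel whose random variance is the variance
% of the directing measure mu.
```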

Uniform rates of the Glivenko–Cantelli convergence and their use in approximating Bayesian inferences

Bernoulli

This paper deals with suitable quantifications in approximating a probability measure by an "empirical" random probability measure $\hat p_n$, depending on the first $n$ terms of a sequence $\{\xi_i\}_{i \ge 1}$ of random elements. Section 2 studies the range of oscillation near zero of the Wasserstein distance $d^{(p)}_{[S]}$ between $p_0$ and $\hat p_n$, assuming the $\xi_i$'s i.i.d. from $p_0$. In Theorem 2.3, $p_0$ can be fixed in the space of all probability measures on $(\mathbb R^d, \mathcal B(\mathbb R^d))$ and $\hat p_n$ coincides with the empirical measure $\hat e_n := \frac1n \sum_{i=1}^n \delta_{\xi_i}$. In Theorem 2.4 (Theorem 2.5, respectively), $p_0$ is a $d$-dimensional Gaussian distribution (an element of a distinguished statistical exponential family, respectively) and $\hat p_n$ is another $d$-dimensional Gaussian distribution with estimated mean and covariance matrix (another element of the same family with an estimated parameter, respectively). These new results improve on allied recent works by also providing uniform bounds with respect to $n$: the finiteness of the $p$-moment of $\sup_{n \ge 1} b_n d^{(p)}_{[S]}(p_0, \hat p_n)$ is proved for some diverging sequence $b_n$ of positive numbers. In Section 3, assuming the $\xi_i$'s exchangeable, one studies the range of oscillation near zero of the Wasserstein distance between the conditional distribution (also called posterior) of the directing measure of the sequence, given $\xi_1, \dots, \xi_n$, and the point mass at $\hat p_n$. Similarly, a bound for the approximation of predictive distributions is given. Finally, Theorems 3.3 to 3.5 reconsider Theorems 2.3 to 2.5, respectively, from a Bayesian perspective.
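A quick Monte Carlo illustration of the $n^{-1/2}$ rate quantified in Section 2, in the simplest one-dimensional case. This sketch is ours, with $p_0$ approximated by a large reference sample.

```python
# 1-Wasserstein distance between the empirical measure of an i.i.d. N(0,1)
# sample and p0 = N(0,1), which shrinks at roughly the n^{-1/2} rate.
# Illustrative sketch only; p0 is approximated by a large reference sample.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(3)
reference = rng.standard_normal(100_000)   # stand-in for p0

for n in (100, 1_000, 10_000):
    d = np.mean([wasserstein_distance(rng.standard_normal(n), reference)
                 for _ in range(20)])
    print(n, round(d, 4), round(np.sqrt(n) * d, 3))
```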

A class of models for Bayesian predictive inference

Bernoulli

In a Bayesian framework, to make predictions on a sequence $X_1, X_2, \dots$ of random observations, the inferrer needs to assign the predictive distributions $\sigma_n(\cdot) = P(X_{n+1} \in \cdot \mid X_1, \dots, X_n)$. In this paper, we propose to assign $\sigma_n$ directly, without passing through the usual prior/posterior scheme. One main advantage is that no prior probability has to be assessed. The data sequence $(X_n)$ is assumed to be conditionally identically distributed (c.i.d.) in the sense of [4]. To realize this programme, a class $\Sigma$ of predictive distributions is introduced and investigated. Such a $\Sigma$ is rich enough to model various real situations, and $(X_n)$ is actually c.i.d. if $\sigma_n$ belongs to $\Sigma$. Furthermore, when a new observation $X_{n+1}$ becomes available, $\sigma_{n+1}$ can be obtained by a simple recursive update of $\sigma_n$. If $\mu$ is the a.s. weak limit of $\sigma_n$, conditions for $\mu$ to be a.s. discrete are provided as well.
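One classical recursion of this flavour (our illustration; the paper's class $\Sigma$ is more general) is the Dirichlet-process predictive, which makes $(X_n)$ exchangeable, hence c.i.d., and has an a.s. discrete limit $\mu$.

```python
# Dirichlet-process predictive (our illustration, not the paper's scheme):
# sigma_n = (theta*nu + sum_{i<=n} delta_{X_i}) / (theta + n), with the
# recursive update sigma_{n+1} = ((theta+n)*sigma_n + delta_{X_{n+1}})/(theta+n+1).
import numpy as np

rng = np.random.default_rng(4)
theta = 5.0                       # total mass of the base measure nu = N(0,1)
atoms = []                        # past observations = atoms of sigma_n

def sample_next():
    """Draw X_{n+1} ~ sigma_n; appending it performs the recursive update."""
    n = len(atoms)
    if rng.random() < theta / (theta + n):
        x = rng.standard_normal()             # fresh draw from nu
    else:
        x = atoms[rng.integers(n)]            # repeat a past observation
    atoms.append(x)
    return x

xs = [sample_next() for _ in range(1_000)]
# The weak limit mu of sigma_n is a.s. discrete here, in line with the
# discreteness conditions discussed in the paper.
print(len(set(xs)), "distinct values among 1000 draws")
```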

Concentration and Confidence for Discrete Bayesian Sequence Predictors

Lecture Notes in Computer Science, 2013

Bayesian sequence prediction is a simple technique for predicting future symbols sampled from an unknown measure on infinite sequences over a countable alphabet. While strong bounds on the expected cumulative error are known, there are only limited results on the distribution of this error. We prove tight high-probability bounds on the cumulative error, which is measured in terms of the Kullback-Leibler (KL) divergence. We also consider the problem of constructing upper confidence bounds on the KL and Hellinger errors similar to those constructed from Hoeffding-like bounds in the i.i.d. case. The new results are applied to show that Bayesian sequence prediction can be used in the Knows What It Knows (KWIK) framework with bounds that match the state-of-the-art.
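A minimal sketch of the setting, assuming a Bayes mixture over a finite class of Bernoulli coins (our toy example, not the paper's): the cumulative KL error is bounded by the log inverse prior weight of the true coin, and the paper's question concerns the distribution of this error rather than its expectation alone.

```python
# Bayes mixture over a finite class of Bernoulli coins, tracking the
# cumulative KL error between the true one-step predictor and the Bayes
# predictor.  Classical bound: sum_n KL <= -log(prior of the true coin).
# Illustrative sketch only.
import numpy as np

rng = np.random.default_rng(5)
coins = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
prior = np.full(len(coins), 1 / len(coins))
true_p = 0.7

posterior = prior.copy()
cum_kl = 0.0
for n in range(1_000):
    bayes_p = float(posterior @ coins)        # Bayes one-step predictive
    cum_kl += (true_p * np.log(true_p / bayes_p)
               + (1 - true_p) * np.log((1 - true_p) / (1 - bayes_p)))
    x = rng.random() < true_p                 # next symbol from the true coin
    lik = coins if x else 1 - coins           # likelihood of the new symbol
    posterior = posterior * lik
    posterior /= posterior.sum()

print(cum_kl, "<=", -np.log(prior[coins == true_p][0]))
```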

Convergence of posterior probabilities in the Bayesian inference strategy

Foundations of Physics, 1985

The formalism of operational statistics, a generalized approach to probability and statistics, provides a setting within which inference strategies can be studied with great clarity. This paper is concerned with the asymptotic behavior of the Bayesian inference strategy in this setting. We consider a sequence of posterior distributions, obtained from a prior as a result of successive conditionings by the events of an admissible sequence. We identify certain statistical hypotheses whose limiting posterior probabilities converge to one. We describe these hypotheses and show that when the prior is vague, they contain those probability models which represent the long-run relative frequencies of occurrence for the events in the sequence.
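A toy illustration of the familiar probabilistic content (ours, and far simpler than the operational-statistics setting): with a vague prior on a Bernoulli parameter, the posterior probability of any neighbourhood of the long-run relative frequency tends to one.

```python
# Vague Beta(1,1) prior on a Bernoulli parameter: the posterior mass of a
# small neighbourhood of the long-run relative frequency tends to one.
# Illustrative sketch only.
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(6)
true_p, eps = 0.3, 0.05
xs = rng.random(100_000) < true_p

for n in (100, 1_000, 100_000):
    s = int(xs[:n].sum())
    post = beta(1 + s, 1 + n - s)             # Beta posterior after n tosses
    print(n, post.cdf(true_p + eps) - post.cdf(true_p - eps))
```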

Limit theorems for a class of identically distributed random variables

The Annals of Probability, 2004

A new type of stochastic dependence for a sequence of random variables is introduced and studied. Precisely, $(X_n)_{n\ge 1}$ is said to be conditionally identically distributed (c.i.d.), with respect to a filtration $(\mathcal G_n)_{n\ge 0}$, if it is adapted to $(\mathcal G_n)_{n\ge 0}$ and, for each $n \ge 0$, $(X_k)_{k>n}$ is identically distributed given the past $\mathcal G_n$. In case $\mathcal G_0 = \{\emptyset, \Omega\}$ and $\mathcal G_n = \sigma(X_1, \dots, X_n)$, a result of Kallenberg implies that $(X_n)_{n\ge 1}$ is exchangeable if and only if it is stationary and c.i.d. After giving some natural examples of nonexchangeable c.i.d. sequences, it is shown that $(X_n)_{n\ge 1}$ is exchangeable if and only if $(X_{\tau(n)})_{n\ge 1}$ is c.i.d. for any finite permutation $\tau$ of $\{1, 2, \dots\}$, and that the distribution of a c.i.d. sequence agrees with an exchangeable law on a certain sub-$\sigma$-field. Moreover, $(1/n)\sum_{k=1}^n X_k$ converges a.s. and in $L^1$ whenever $(X_n)_{n\ge 1}$ is (real-valued) c.i.d. and $E[|X_1|] < \infty$. As to the CLT, three types of random centering are considered. One such centering, significant in Bayesian prediction and discrete time filtering, is $E[X_{n+1} \mid \mathcal G_n]$. For each centering, convergence in distribution of the corresponding empirical process is analyzed under uniform distance.
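In symbols, the defining condition reads as follows (a reconstruction from the abstract's wording):

```latex
% The c.i.d. condition (reconstructed from the abstract): for all n >= 0,
% all k > n, and all bounded measurable f,
\[
  E\bigl[f(X_k)\,\big|\,\mathcal G_n\bigr]
  \;=\; E\bigl[f(X_{n+1})\,\big|\,\mathcal G_n\bigr] \quad \text{a.s.},
\]
% that is, given the past, all future observations share one distribution.
```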