A new proof of convergence of MCMC via the ergodic theorem

The Markov Renewal Theorem and Related Results

We give a new probabilistic proof of the Markov renewal theorem for Markov random walks with positive drift and Harris recurrent driving chain. It provides an alternative to the proof recently given in [1] and follows more closely the probabilistic proofs of Blackwell's theorem in the literature, making use of ladder variables, the stationary Markov delay distribution and a coupling argument. A major advantage is that the arguments can be refined to yield results on rates of convergence.
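
For reference, Blackwell's renewal theorem, whose probabilistic proofs the paper adapts to the Markov setting, states that for a nonlattice renewal process with mean interarrival time $\mu > 0$ and renewal measure $U$:

$$U([t, t+h]) \xrightarrow[t \to \infty]{} \frac{h}{\mu} \qquad \text{for every } h > 0.$$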

On the Markov Renewal Theorem (Corrected version)

Let $(S, \mathcal{S})$ be a measurable space with countably generated $\sigma$-field $\mathcal{S}$ and $(M_n, X_n)_{n \ge 0}$ a Markov chain with state space $S \times \mathbb{R}$ and transition kernel $\mathbb{P} : S \times (\mathcal{S} \otimes \mathcal{B}) \to [0, 1]$. Then $(M_n, S_n)_{n \ge 0}$, where $S_n = X_0 + \cdots + X_n$ for $n \ge 0$, is called the associated Markov random walk. Markov renewal theory deals with the asymptotic behavior of suitable functionals of $(M_n, S_n)_{n \ge 0}$, like the Markov renewal measure $\sum_{n \ge 0} P((M_n, S_n) \in A \times (t + B))$ as $t \to \infty$, where $A \in \mathcal{S}$ and $B$ denotes a Borel subset of $\mathbb{R}$. It is shown that the Markov renewal theorem, as well as a related ergodic theorem for semi-Markov processes, holds true if only Harris recurrence of $(M_n)_{n \ge 0}$ is assumed. This was proved by purely analytical methods by Shurenkov [16] in the one-sided case where $\mathbb{P}(x, S \times [0, \infty)) = 1$ for all $x \in S$. Our proof uses probabilistic arguments, notably the construction of regeneration epochs for $(M_n)_{n \ge 0}$ such that $(M_n, X_n)_{n \ge 0}$ is at least nearly regenerative, and an extension of Blackwell's renewal theorem.
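
In this notation, the conclusion of the Markov renewal theorem in the nonlattice case takes, roughly, the following form (with $\xi$ the stationary measure of the driving chain, $\mu > 0$ the stationary drift, and $\lambda$ Lebesgue measure; normalization conventions vary between sources):

$$\sum_{n \ge 0} P\big((M_n, S_n) \in A \times (t + B)\big) \xrightarrow[t \to \infty]{} \frac{\xi(A)\,\lambda(B)}{\mu}, \qquad A \in \mathcal{S},\ B \in \mathcal{B}.$$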

On the Markov renewal theorem

Stochastic Processes and their Applications, 1994

Markov renewal theory deals with the asymptotic behavior of suitable functionals of $(M_n, S_n)_{n \ge 0}$, like the Markov renewal measure $\sum_{n \ge 0} P((M_n, S_n) \in A \times (t + B))$ as $t \to \infty$, where $A \in \mathcal{S}$ and $B$ denotes a Borel subset of $\mathbb{R}$. It is shown that the Markov renewal theorem as well as a related ergodic theorem for semi-Markov processes hold true if only Harris recurrence of $(M_n)_{n \ge 0}$ is assumed. This was proved by purely analytical methods by Shurenkov [15] in the one-sided case where $\mathbb{P}(x, S \times [0, \infty)) = 1$ for all $x \in S$. Our proof uses probabilistic arguments, notably the construction of regeneration epochs for $(M_n)_{n \ge 0}$ such that $(M_n, X_n)_{n \ge 0}$ is at least nearly regenerative.

Uniform limit theorems for Harris recurrent Markov chains

Probability Theory and Related Fields, 1988

We study uniform limit theorems for regenerative processes and obtain a strong law of large numbers and a central limit theorem of this type. We then apply these results to Harris recurrent Markov chains, building on ideas of K. Athreya, P. Ney and E. Nummelin.

On ergodicity of some Markov processes

The Annals of Probability, 2010

We formulate a criterion for the existence and uniqueness of an invariant measure for a Markov process taking values in a Polish phase space. In addition, weak-$*$ ergodicity, that is, the weak convergence of the ergodic averages of the laws of the process starting from any initial distribution, is established. The principal assumptions are a lower bound on the ergodic averages of the transition probability function and the e-property of the semigroup. The general result is applied to solutions of some stochastic evolution equations in Hilbert spaces. As an example we consider an evolution equation whose solution describes the Lagrangian observations of the velocity field in the passive tracer model. The weak-$*$ mean ergodicity of the respective invariant measure is used to derive the law of large numbers for the trajectory of a tracer.
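
In symbols, the weak-$*$ ergodicity asserted here can be sketched as follows (with $E$ the Polish phase space, $P^n$ the $n$-step transition operator, and $\mu^*$ the invariant measure; the notation is generic rather than taken from the paper): for every initial distribution $\mu$,

$$\frac{1}{N} \sum_{n=1}^{N} \int_E P^n f \, d\mu \;\xrightarrow[N \to \infty]{}\; \int_E f \, d\mu^* \qquad \text{for all } f \in C_b(E).$$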

Erratum to: Extremal indices, geometric ergodicity of Markov chains, and MCMC

Extremes, 2010

We investigate the connections between extremal indices on the one hand and the stability of Markov chains on the other. Both theories relate to the tail behaviour of stochastic processes, and we find a close link between the extremal index and geometric ergodicity. Our results are illustrated throughout with examples from simple MCMC chains.

Ergodic theorems and ergodic decomposition for Markov chains

1998

This paper considers Markov chains on a locally compact separable metric space which have an invariant probability measure but with no other assumption on the transition kernel. Within this context, the limit provided by several ergodic theorems is explicitly identified in terms of the limit of the expected occupation measures. We also extend Yosida's 'ergodic' decomposition for Feller-like kernels to arbitrary kernels, and present ergodic results for empirical occupation measures, as well as for additive-noise systems.

Notes on General State Space Markov Chains and MCMC Algorithms

2020

Given i.i.d. samples $Z_1, \ldots, Z_N \sim \pi(\cdot)$, the estimator $\hat{\pi}(f) = \frac{1}{N} \sum_{i=1}^{N} f(Z_i)$ is a random variable with expected value $\pi(f)$ (i.e. the estimator is unbiased) and standard deviation of order $O(1/\sqrt{N})$. Then by the CLT, the error $\hat{\pi}(f) - \pi(f)$ has a limiting normal distribution as $N \to \infty$. Therefore we can estimate $\pi(f)$ by simulating samples (possibly combined with regression techniques). But if the (possibly unnormalized) density $\pi_u$ of $\pi(\cdot)$ is complicated, then it is very difficult to simulate i.i.d. random variables from $\pi(\cdot)$.

The MCMC solution is to construct a Markov chain on $\mathcal{X}$ which has $\pi(\cdot)$ as a stationary distribution, i.e.

$$\int_{\mathcal{X}} \pi(dx) P(x, dy) = \pi(dy).$$

Then for large $n$ the distribution of $X_n$ will be approximately stationary. We can set $Z_1 = X_n$ and obtain $Z_2, Z_3, \ldots$ by repeating the procedure.

Remark. In practice, instead of starting a fresh Markov chain every time, we take the successive $X_n$'s from one run, for example $(N - B)^{-1} \sum_{i=B+1}^{N} f(X_i)$, where $B$ is a burn-in period. We tend to ignore the dependence problem, as many of the mathematical issues are similar in either implementation.

Remark. There are other methods of estimation, such as "rejection sampling" and "importance sampling", but MCMC algorithms are the most widely applied.

2 MCMC and its construction

This section explains how an MCMC algorithm is constructed. We first introduce reversibility.

Definition. A Markov chain on a state space $\mathcal{X}$ is reversible with respect to a probability distribution $\pi(\cdot)$ on $\mathcal{X}$ if

$$\pi(dx) P(x, dy) = \pi(dy) P(y, dx), \qquad x, y \in \mathcal{X}.$$

Proposition. If a Markov chain is reversible with respect to $\pi(\cdot)$, then $\pi(\cdot)$ is a stationary distribution for the chain.

Proof. By reversibility,

$$\int_{x \in \mathcal{X}} \pi(dx) P(x, dy) = \int_{x \in \mathcal{X}} \pi(dy) P(y, dx) = \pi(dy) \int_{x \in \mathcal{X}} P(y, dx) = \pi(dy).$$

The simplest way to construct an MCMC algorithm satisfying reversibility is the Metropolis-Hastings algorithm.

2.1 The Metropolis-Hastings Algorithm

Suppose that $\pi(\cdot)$ has a (possibly unnormalized) density $\pi_u$. Let $Q(x, \cdot)$ be essentially any other Markov chain, whose transitions also have a (possibly unnormalized) density, i.e. $Q(x, dy) \propto q(x, y)\, dy$. First choose some $X_0$. Then, given $X_n$, generate a proposal $Y_{n+1}$ from $Q(X_n, \cdot)$. At the same time, flip an independent biased coin with probability of heads equal to $\alpha(X_n, Y_{n+1})$, where

$$\alpha(x, y) = \min\left\{1,\ \frac{\pi_u(y)\, q(y, x)}{\pi_u(x)\, q(x, y)}\right\} \quad \text{if } \pi_u(x)\, q(x, y) \neq 0,$$

and $\alpha(x, y) = 1$ when $\pi_u(x)\, q(x, y) = 0$. If the coin is heads, we accept the proposal and set $X_{n+1} = Y_{n+1}$; if the coin is tails, we reject the proposal and set $X_{n+1} = X_n$. Then we replace $n$ by $n + 1$ and repeat. The reason for this choice of $\alpha(x, y)$ is explained by the following proposition.

Proposition. The Metropolis-Hastings algorithm produces a Markov chain $\{X_n\}$ which is reversible with respect to $\pi(\cdot)$.

Proof. We want to show that for any $x, y \in \mathcal{X}$, $\pi(dx) P(x, dy) = \pi(dy) P(y, dx)$. It suffices to consider $x \neq y$, for which

$$\pi(dx) P(x, dy) = c\, \pi_u(x)\, q(x, y)\, \alpha(x, y)\, dx\, dy = c \min\{\pi_u(x)\, q(x, y),\ \pi_u(y)\, q(y, x)\}\, dx\, dy,$$

where $c$ is a normalizing constant; this expression is symmetric in $x$ and $y$.

2.2 The Gibbs Sampler

In a hierarchical example with group means $\bar{Y}_i = \frac{1}{J} \sum_j Y_{ij}$, the Gibbs sampler proceeds by updating the $K + 3$ variables in turn according to their conditional distributions. This is feasible since the conditional distributions are all easily simulated (IG and N).
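
As a concrete illustration of the algorithm just described, here is a minimal runnable sketch of Metropolis-Hastings in Python, specialized to a symmetric random-walk proposal (so the $q$-ratio cancels in $\alpha$). The target density pi_u, the step size, and the burn-in B are made up for this example rather than taken from the notes.

import numpy as np

rng = np.random.default_rng(42)

def pi_u(x):
    # Unnormalized target density pi_u: a standard normal shape,
    # chosen purely for illustration (an assumption of this sketch).
    return np.exp(-0.5 * x**2)

def metropolis_hastings(n_samples, x0=0.0, step=1.0):
    # Random-walk Metropolis: the proposal y = x + step*Z is symmetric,
    # so alpha(x, y) reduces to min(1, pi_u(y) / pi_u(x)).
    chain = np.empty(n_samples)
    x = x0
    for n in range(n_samples):
        y = x + step * rng.normal()           # proposal Y_{n+1} ~ Q(X_n, .)
        alpha = min(1.0, pi_u(y) / pi_u(x))   # acceptance probability
        if rng.uniform() < alpha:             # coin is heads: accept
            x = y                             # set X_{n+1} = Y_{n+1}
        chain[n] = x                          # on tails, X_{n+1} = X_n
    return chain

# Estimate pi(f) for f(x) = x^2 (true value 1 for a standard normal),
# discarding a burn-in of B iterations as in the remark above.
B, N = 1000, 50000
chain = metropolis_hastings(N)
print((chain[B:]**2).mean())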

Strongly ergodic Markov chains and rates of convergence using spectral conditions

Stochastic Processes and their Applications, 1978

For finite Markov chains the eigenvalues of $P$ can be used to characterize the chain and also determine the geometric rate at which $P^n$ converges to $Q$ in case $P$ is ergodic. For infinite Markov chains the spectrum of $P$ plays the analogous role. It follows from Theorem 3.1 that $\|P^n - Q\| \le C\beta^n$ if and only if $P$ is strongly ergodic. The best possible rate $\beta$ is the spectral radius of $P - Q$, which in this case is the same as $\sup\{|\lambda| : \lambda \in \sigma(P),\ \lambda \neq 1\}$. The question of when this best rate equals $\delta(P)$ is considered for both discrete and continuous time chains. Two characterizations of strong ergodicity are given using spectral properties of $P - Q$ (Theorem 3.5) and spectral properties of a submatrix of $P$ (Theorem 3.16).
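
A minimal numerical illustration, in Python, of the finite-chain case described above: the spectral radius of $P - Q$ gives the best geometric rate $\beta$ for $\|P^n - Q\|$. The 3-state transition matrix is made up for this example, and numpy's eigendecomposition stands in for the spectral analysis.

import numpy as np

# A small ergodic transition matrix P (invented for illustration).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# Stationary distribution pi: normalized left eigenvector of P
# for the eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi /= pi.sum()

# Q is the rank-one limit matrix whose rows all equal pi.
Q = np.tile(pi, (3, 1))

# beta: spectral radius of P - Q, which here equals
# sup{|lambda| : lambda in sigma(P), lambda != 1}.
beta = max(abs(np.linalg.eigvals(P - Q)))

# Numerically check the geometric decay ||P^n - Q|| <= C * beta**n.
for n in (1, 5, 10, 20):
    err = np.linalg.norm(np.linalg.matrix_power(P, n) - Q, ord=np.inf)
    print(n, err, beta**n)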