Formalization of a Stochastic Approximation Theorem

Generalization of a Result of Fabian on the Asymptotic Normality of Stochastic Approximation

Automatica, 2018

Stochastic approximation (SA) is a general framework for analyzing the convergence of a large collection of stochastic root-finding algorithms. The Kiefer-Wolfowitz and stochastic gradient algorithms are two well-known (and widely used) examples of SA. Because of their applicability to a wide range of problems, many results have been obtained regarding the convergence properties of SA procedures. One important reference in the literature, Fabian (1968), derives general conditions for the asymptotic normality of the SA iterates. Since then, many results regarding asymptotic normality of SA procedures have relied heavily on Fabian's theorem. Unfortunately, some of the assumptions of Fabian's result are not applicable to some modern implementations of SA in control and learning. In this paper we explain the nature of this incompatibility and show how Fabian's theorem can be generalized to address the issue.
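To make the SA framework concrete, here is a minimal sketch of the classic Robbins-Monro recursion that the abstract alludes to; the target function, noise level, and gain sequence are all illustrative assumptions, not taken from the paper.

```python
import random

def robbins_monro(g, x0, n_iter=5000, a=1.0):
    """Find a root of f(x) = E[observation at x] via the Robbins-Monro
    recursion x_{k+1} = x_k - a_k * y_k, where y_k is a noisy sample of f."""
    x = x0
    for k in range(1, n_iter + 1):
        a_k = a / k                        # gains satisfy sum a_k = inf, sum a_k^2 < inf
        y = g(x) + random.gauss(0.0, 0.1)  # noisy observation of f(x)
        x = x - a_k * y
    return x

# Illustrative target: f(x) = 2*(x - 3) has its root at x* = 3.
random.seed(0)
root = robbins_monro(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

Stochastic gradient descent fits this template by taking the observation to be a sampled gradient, and Kiefer-Wolfowitz by taking it to be a finite-difference estimate.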

Rates of Convergence of Adaptive Step-Size of Stochastic Approximation Algorithms

Journal of Mathematical Analysis and Applications, 2000

We propose a new adaptive algorithm with decreasing step size for stochastic approximation. Adaptive algorithms are widely used across fields such as system identification and adaptive control. We analyze the rate of convergence of the proposed algorithms. An averaging algorithm, chosen for the optimality of its rate of convergence, is used to control the step sizes. Our proofs are based on recent results in stochastic approximation and a Gaussian approximation theorem.
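The averaging idea mentioned above can be illustrated with a Polyak-Ruppert style sketch: run SA with a slowly decreasing step size and report the running average of the iterates, which is known to attain the optimal convergence rate. The target function, step-size constants, and noise level below are illustrative assumptions, not the paper's construction.

```python
import random

def averaged_sa(g, x0, n_iter=20000, a=0.5, alpha=0.7):
    """SA with step size a_k = a / k**alpha, alpha in (1/2, 1),
    returning the running average of the iterates (Polyak-Ruppert)."""
    x = x0
    avg = 0.0
    for k in range(1, n_iter + 1):
        a_k = a / k**alpha
        y = g(x) + random.gauss(0.0, 0.1)  # noisy observation of g(x)
        x -= a_k * y
        avg += (x - avg) / k               # running mean of the iterates
    return avg

# Illustrative target: g(x) = x - 1 has its root at x* = 1.
random.seed(1)
est = averaged_sa(lambda x: x - 1.0, x0=0.0)
```

The larger-than-usual step size lets the raw iterates forget initial conditions quickly, while the averaging smooths out the noise they accumulate.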

Equivalent necessary and sufficient conditions on noise sequences for stochastic approximation algorithms

1996

We consider stochastic approximation algorithms on a general Hilbert space, and study four conditions on noise sequences for their analysis: Kushner and Clark's condition, Chen's condition, a decomposition condition, and Kulkarni and Horn's condition. We discuss various properties of these conditions. In our main result we show that the four conditions are all equivalent, and are both necessary and sufficient for convergence of stochastic approximation algorithms under appropriate assumptions.

Convergence Rates and Decoupling in Linear Stochastic Approximation Algorithms

arXiv (Cornell University), 2015

Almost sure convergence rates are established for linear stochastic approximation algorithms; the required assumptions are implied by the Marcinkiewicz strong law of large numbers, which allows the {A k } and {b k } to have heavy tails, long-range dependence, or both. Finally, corroborating experimental outcomes and decreasing-gain design considerations are provided.

Stochastic Approximation: From Statistical Origin to Big-Data, Multidisciplinary Applications

Statistical Science, 2021

Stochastic approximation was introduced in 1951 to provide a new theoretical framework for root finding and optimization of a regression function in the then-nascent field of statistics. This review shows how it has evolved in response to other developments in statistics, notably time series and sequential analysis, and to applications in artificial intelligence, economics, and engineering. Its resurgence in the Big Data Era has led to new advances in both theory and applications of this microcosm of statistics and data science.

A deterministic analysis of stochastic approximation with randomized directions

1998

We study the convergence of two stochastic approximation algorithms with randomized directions: the simultaneous perturbation stochastic approximation algorithm and the random direction Kiefer-Wolfowitz algorithm. We establish deterministic necessary and sufficient conditions on the random directions and noise sequences for both algorithms, and these conditions demonstrate the effect of the “random” directions on the “sample-path” behavior of the algorithms studied.
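A minimal sketch of the simultaneous perturbation idea, assuming a standard SPSA setup (Rademacher perturbation directions and two noisy function evaluations per step); the objective, gain constants, and noise level are illustrative, not taken from the paper.

```python
import random

def spsa_step(f, x, k, a=0.25, c=0.1):
    """One SPSA iteration: perturb all coordinates at once along a random
    +/-1 direction and estimate the gradient from two noisy evaluations."""
    a_k = a / k          # gain sequence
    c_k = c / k**0.25    # perturbation size
    delta = [random.choice((-1.0, 1.0)) for _ in x]  # Rademacher directions
    x_plus = [xi + c_k * di for xi, di in zip(x, delta)]
    x_minus = [xi - c_k * di for xi, di in zip(x, delta)]
    diff = (f(x_plus) - f(x_minus)) / (2.0 * c_k)
    return [xi - a_k * diff / di for xi, di in zip(x, delta)]

# Illustrative objective: minimize f(x) = sum((x_i - 1)^2), observed with noise.
random.seed(2)
f = lambda x: sum((xi - 1.0) ** 2 for xi in x) + random.gauss(0.0, 0.01)
x = [0.0, 0.0]
for k in range(1, 3001):
    x = spsa_step(f, x, k)
```

Note that only two function evaluations are needed per step regardless of dimension, which is SPSA's advantage over coordinate-wise Kiefer-Wolfowitz schemes.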

A Stochastic Approximation Algorithm with Step-Size Adaptation

Journal of Mathematical Sciences, 2000

We consider the following stochastic approximation algorithm for searching for the zero point x* of a function φ: x_{t+1} = x_t − γ_t y_t, y_t = φ(x_t) + ξ_t, where y_t are observations of φ and ξ_t is random noise. The step sizes γ_t of the algorithm are random, with the increment γ_{t+1} − γ_t depending on γ_t and on y_t y_{t−1} in a rather general form. Roughly speaking, γ_t increases when y_t y_{t−1} > 0 and decreases otherwise. It is proved that the algorithm converges to x* almost surely. This result generalizes earlier results in which γ_{t+1} − γ_t is assumed to depend only on γ_t and sgn(y_t y_{t−1}), and not on the magnitude of y_t y_{t−1}.
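The sign-only special case that this result generalizes can be sketched with a Kesten-style rule: a sign change in y_t y_{t−1} signals oscillation around the root and shrinks the step. The function, noise level, and constants below are illustrative assumptions, and the paper's more general scheme would let the step-size update also use the magnitude of y_t y_{t−1}.

```python
import random

def kesten_sa(phi, x0, n_iter=20000, a=1.0):
    """Scalar SA whose step size reacts to the sign of y_t * y_{t-1}:
    successive observations of the same sign leave the step alone, while
    each sign change shrinks it via gamma_t = a / (1 + #sign changes)."""
    x = x0
    changes = 0
    y_prev = None
    for _ in range(n_iter):
        y = phi(x) + random.gauss(0.0, 0.1)  # noisy observation of phi(x)
        if y_prev is not None and y * y_prev < 0:
            changes += 1                     # oscillation detected: shrink step
        gamma = a / (1 + changes)
        x -= gamma * y
        y_prev = y
    return x

# Illustrative target: phi(x) = x - 2 has its zero at x* = 2.
random.seed(3)
root = kesten_sa(lambda x: x - 2.0, x0=0.0)
```

Far from the root the observations keep one sign, so the step stays large; near the root the noise makes signs alternate roughly half the time, so γ_t decays like 1/t, as the standard convergence conditions require.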