Variational Information Maximization in Stochastic Environments

2006

Information maximization is a common framework for unsupervised learning, which may be used to extract informative representations y of the observed patterns x. The key idea is to maximize mutual information (MI), a formal measure of coding efficiency. Unfortunately, exact maximization of MI is computationally tractable only in a few special cases; more generally, approximations need to be considered. Here we describe a family of variational lower bounds on mutual information which gives rise to a formal and theoretically rigorous approach to information maximization in large-scale stochastic channels. We hope that the results presented in this work are of interest for information maximization from several perspectives. First of all, our method optimizes a proper lower bound, rather than a surrogate objective criterion or an approximation of MI (which may only be accurate under specific asymptotic assumptions, and weak or even undefined when those assumptions are violated). …
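To make the idea concrete, here is a minimal sketch (not taken from the thesis itself) of the standard variational lower bound on mutual information, I(X;Y) >= H(X) + E_{p(x,y)}[log q(x|y)], where q(x|y) is an auxiliary variational decoder. The bound is tight exactly when q(x|y) equals the true posterior p(x|y), which is what makes it a proper lower bound rather than an uncontrolled approximation. The toy channel, distributions, and function names below are illustrative assumptions, not the paper's setup:

```python
# Sketch: variational lower bound on mutual information,
#     I(X;Y) >= H(X) + E_{p(x,y)}[log q(x|y)],
# tight when the variational decoder q(x|y) matches the true posterior.
# The 4x5 discrete channel here is a toy example, not from the thesis.
import numpy as np

rng = np.random.default_rng(0)

p_x = np.array([0.1, 0.2, 0.3, 0.4])             # source distribution p(x)
p_y_given_x = rng.dirichlet(np.ones(5), size=4)  # rows: channel p(y|x), 4x5

p_xy = p_x[:, None] * p_y_given_x                # joint p(x, y)
p_y = p_xy.sum(axis=0)                           # marginal p(y)

def exact_mi(p_xy, p_x, p_y):
    """Exact mutual information I(X;Y) in nats."""
    ratio = p_xy / (p_x[:, None] * p_y[None, :])
    return float(np.sum(p_xy * np.log(ratio)))

def variational_bound(p_xy, p_x, q_x_given_y):
    """Lower bound H(X) + E_{p(x,y)}[log q(x|y)] for a decoder q(x|y)."""
    h_x = -float(np.sum(p_x * np.log(p_x)))
    return h_x + float(np.sum(p_xy * np.log(q_x_given_y)))

# Optimal decoder: the true posterior p(x|y); the bound becomes exact.
posterior = p_xy / p_y[None, :]
# A deliberately crude decoder that ignores y entirely: q(x|y) = p(x).
crude = np.tile(p_x[:, None], (1, 5))

print("exact MI          :", exact_mi(p_xy, p_x, p_y))
print("bound, q = p(x|y) :", variational_bound(p_xy, p_x, posterior))
print("bound, q = p(x)   :", variational_bound(p_xy, p_x, crude))  # -> 0
```

Running this shows the bound coinciding with the exact MI for the posterior decoder and collapsing to zero for the decoder that ignores y; optimizing the bound over a tractable family of decoders q is, in spirit, what makes the approach scale to large stochastic channels where the exact posterior is unavailable.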