Sparsity in Variational Autoencoders

A Winner Take All Method for Training Sparse Convolutional Autoencoders

We explore combining the benefits of convolutional architectures and autoencoders for learning deep representations in an unsupervised manner. A major challenge is to achieve appropriate sparsity among hidden variables, since neighbouring variables in each feature map tend to be highly correlated and a suppression mechanism is therefore needed. Previously, deconvolutional networks and convolutional predictive sparse decomposition have been used to construct systems that have a recognition pathway and a data generation pathway that are trained so that they agree and so that the hidden representation is sparse. We take a more direct approach and describe a way to train convolutional autoencoders layer by layer, where in each layer sparsity is achieved using a winner-take-all activation function within each feature map. Learning is computationally efficient and we show that our method can be used to train shallow and deep convolutional autoencoders whose representations can be used to achieve classification rates on the MNIST, CIFAR-10 and NORB datasets that are competitive with the state of the art.
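
As a rough illustration of the winner-take-all idea described above, the following sketch keeps only the single largest activation in each feature map and zeroes the rest. It assumes one winner per map and plain nested lists as feature maps; the function name `winner_take_all` is hypothetical, and the abstract does not specify further details of the authors' method, so this is not their implementation.

```python
def winner_take_all(feature_maps):
    """Keep only the largest activation in each 2-D feature map; zero the rest.

    Assumption: exactly one winner per feature map (ties go to the first
    maximum encountered in row-major order).
    """
    sparse = []
    for fmap in feature_maps:
        # Locate the single winning position in this feature map.
        best_value, best_pos = None, None
        for i, row in enumerate(fmap):
            for j, value in enumerate(row):
                if best_value is None or value > best_value:
                    best_value, best_pos = value, (i, j)
        # Copy the winner through and suppress every other position.
        sparse.append([
            [value if (i, j) == best_pos else 0.0 for j, value in enumerate(row)]
            for i, row in enumerate(fmap)
        ])
    return sparse


# Two toy 2x2 feature maps (hypothetical values).
maps = [
    [[0.1, 0.9], [0.4, 0.2]],
    [[0.3, 0.0], [0.7, 0.5]],
]
sparse_maps = winner_take_all(maps)
```

Because neighbouring activations within a map tend to be correlated, suppressing everything but the maximum is one direct way to get the sparsity the abstract calls for without an explicit penalty term.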

Jigsaw-VAE: Towards Balancing Features in Variational Autoencoders

ArXiv, 2020

The latent variables learned by VAEs have seen considerable interest as an unsupervised way of extracting features, which can then be used for downstream tasks. There is a growing interest in the question of whether features learned in one environment will generalize across different environments. We demonstrate here that VAE latent variables often focus on some factors of variation at the expense of others; in this case we refer to the features as "imbalanced". Feature imbalance leads to poor generalization when the latent variables are used in an environment where the presence of features changes. Similarly, latent variables trained with imbalanced features induce the VAE to generate less diverse (i.e. biased towards dominant features) samples. To address this, we propose a regularization scheme for VAEs, which we show substantially addresses the feature imbalance problem. We also introduce a simple metric to measure the balance of features in generated images.

Variance Loss in Variational Autoencoders

Machine Learning, Optimization, and Data Science, 2020

In this article, we highlight what appears to be a major issue of Variational Autoencoders, evinced from extensive experimentation with different network architectures and datasets: the variance of generated data is appreciably lower than that of training data. Since generative models are usually evaluated with metrics such as the Fréchet Inception Distance (FID) that compare the distributions of (features of) real versus generated images, the variance loss typically results in degraded scores. This problem is particularly relevant in a two-stage setting, where we use a second VAE to sample in the latent space of the first VAE. The reduced variance creates a mismatch between the actual distribution of latent variables and those generated by the second VAE, which hinders the beneficial effects of the second stage. Renormalizing the output of the second VAE towards the expected normal spherical distribution, we obtain a sudden burst in the quality of generated samples, as also confirmed in terms of FID.
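
The renormalization step mentioned above might be sketched as follows. The abstract does not state the exact procedure, so this example assumes the simplest reading: each latent dimension of a batch of second-stage samples is standardized to zero mean and unit variance, matching a spherical normal prior. The function name `renormalize` and the sample values are hypothetical.

```python
import statistics


def renormalize(latents):
    """Rescale each latent dimension of a batch to zero mean and unit variance.

    Assumption: per-dimension standardization over the batch; the paper's
    actual renormalization may differ.
    """
    dims = list(zip(*latents))  # transpose: one tuple per latent dimension
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) for d in dims]
    return [
        # A dimension with zero spread cannot be rescaled; map it to 0.0.
        [(z - m) / s if s > 0 else 0.0 for z, m, s in zip(row, means, stds)]
        for row in latents
    ]


# Toy batch of three 2-D latent samples with too little variance.
batch = [[0.2, -0.1], [0.4, 0.1], [0.6, 0.3]]
normalized = renormalize(batch)
```

The rescaled samples would then be passed to the first-stage decoder in place of the raw second-stage output.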

Shape your Space: A Gaussian Mixture Regularization Approach to Deterministic Autoencoders

2021

Variational Autoencoders (VAEs) are powerful probabilistic models to learn representations of complex data distributions. One important limitation of VAEs is the strong prior assumption that latent representations learned by the model follow a simple uni-modal Gaussian distribution. Further, the variational training procedure poses considerable practical challenges. Recently proposed regularized autoencoders offer a deterministic autoencoding framework, that simplifies the original VAE objective and is significantly easier to train. Since these models only provide weak control over the learned latent distribution, they require an ex-post density estimation step to generate samples comparable to those of VAEs. In this paper, we propose a simple and end-to-end trainable deterministic autoencoding framework, that efficiently shapes the latent space of the model during training and utilizes the capacity of expressive multi-modal latent distributions. The proposed training procedure prov...

Variational Autoencoders Without the Variation

arXiv (Cornell University), 2022

Variational autoencoders (VAEs) are a popular approach to generative modelling. However, exploiting the capabilities of VAEs in practice can be difficult. Recent work on regularised and entropic autoencoders has begun to explore the potential, for generative modelling, of removing the variational approach and returning to the classic deterministic autoencoder (DAE) with additional novel regularisation methods. In this paper we empirically explore the capability of DAEs for image generation without additional novel methods and the effect of the implicit regularisation and smoothness of large networks. We find that DAEs can be used successfully for image generation without additional loss terms, and that many of the useful properties of VAEs can arise implicitly from sufficiently large convolutional encoders and decoders when trained on CIFAR-10 and CelebA.

Variations in Variational Autoencoders - A Comparative Evaluation

IEEE Access

Variational Auto-Encoders (VAEs) are deep latent space generative models which have been immensely successful in many applications such as image generation, image captioning, protein design, mutation prediction, and language models, among others. The fundamental idea in VAEs is to learn the distribution of data in such a way that new meaningful data can be generated from the encoded distribution. This concept has led to tremendous research and variations in the design of VAEs in the last few years, creating a field of its own, referred to as unsupervised representation learning. This paper provides a much-needed comprehensive evaluation of the variations of VAEs based on their end goals and resulting architectures. It further provides intuition as well as mathematical formulation and quantitative results for each popular variation, presents a concise comparison of these variations, and concludes with challenges and future opportunities for research in VAEs.

Shedding Light on Variational Autoencoders

2018

Deep neural networks provide the canvas to create models of millions of parameters to fit distributions involving an equally large number of random variables. The contribution of this study is twofold. First, we introduce a diffraction dataset containing computer-based simulations of a Young's interference experiment. Then, we demonstrate the adeptness of variational autoencoders to learn diffraction patterns and extract a latent feature that correlates with the physical wavelength.

Balancing Reconstruction Error and Kullback-Leibler Divergence in Variational Autoencoders

IEEE Access, 2020

Likelihood-based generative frameworks are receiving increasing attention in the deep learning community, mostly on account of their strong probabilistic foundation. Among them, Variational Autoencoders (VAEs) are reputed for their fast and tractable sampling and relatively stable training, but if not properly tuned they may easily produce poor generative performance. The loss function of Variational Autoencoders is the sum of two components with somewhat contrasting effects: the reconstruction loss, improving the quality of the resulting images, and the Kullback-Leibler divergence, acting as a regularizer of the latent space. Correctly balancing these two components is a delicate issue, and one of the major problems of VAEs. Recent techniques address the problem by allowing the network to learn the balancing factor during training, according to a suitable loss function. In this article, we show that learning can be replaced by a simple deterministic computation, expressing the balancing factor in terms of a running average of the reconstruction error over the last minibatches. As a result, we keep a constant balance between the two components throughout training: as reconstruction improves, we proportionally decrease the weight of the KL-divergence to keep it from dominating, which would prevent further improvements in the quality of reconstructions. Our technique is simple and effective: it clarifies the learning objective for the balancing factor, and it produces faster and more accurate behaviours. On typical datasets such as CIFAR-10 and CelebA, our technique appreciably outperforms all previous VAE architectures with comparable parameter capacity.
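
The deterministic balancing rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' formula: the KL weight is taken to be proportional to the mean reconstruction error over a fixed window of recent minibatches, and the class name `RunningBalance`, the `window` size, and the `scale` constant are all hypothetical.

```python
from collections import deque


class RunningBalance:
    """Track recent reconstruction errors and derive a KL weight from them."""

    def __init__(self, window=10, scale=1.0):
        # deque with maxlen silently drops the oldest error when full.
        self.errors = deque(maxlen=window)
        self.scale = scale  # assumed proportionality constant

    def kl_weight(self, reconstruction_error):
        """Record the latest error and return the current KL weight."""
        self.errors.append(reconstruction_error)
        return self.scale * sum(self.errors) / len(self.errors)


def total_loss(reconstruction_error, kl_divergence, balance):
    """Reconstruction loss plus KL divergence scaled by the running balance."""
    weight = balance.kl_weight(reconstruction_error)
    return reconstruction_error + weight * kl_divergence


# As the (hypothetical) reconstruction error falls, so does the KL weight.
balance = RunningBalance(window=3)
weights = [balance.kl_weight(err) for err in [0.9, 0.6, 0.3, 0.3]]
```

In an actual training loop the weight would be treated as a constant with respect to the gradient, so that only the two loss terms are differentiated.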

Sparse Autoencoders Using Non-smooth Regularization

2018 26th European Signal Processing Conference (EUSIPCO), 2018

The autoencoder, at the heart of a deep learning structure, plays an important role in extracting abstract representations of a set of input training patterns. An abstract representation contains informative features that describe a large set of data patterns in an optimal way in certain applications. It has been shown that sparse regularization of the outputs of the hidden units (codes) in an autoencoder can enhance the quality of the codes, leading to higher learning performance in applications such as classification. Almost all methods that try to achieve code sparsity in an autoencoder use a smooth approximation of the ℓ1 norm, itself the best convex approximation of the ℓ0 pseudo-norm. In this paper, we incorporate sparsity into the autoencoder training optimization process using the non-smooth convex ℓ1 norm and propose an efficient algorithm to train the structure. Non-smooth ℓ1 regularization has shown its efficiency in imposing sparsity in various applications, including feature selection via the lasso and sparse representation using basis pursuit. Our experimental results on three benchmark datasets show the superiority of this term in training a sparse autoencoder over previously proposed ones. As a byproduct, the proposed method can also be used to apply different types of non-smooth regularizers to the autoencoder training problem.
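
To make the non-smooth ℓ1 penalty concrete, the sketch below shows soft-thresholding, the proximal operator of the ℓ1 norm that is commonly used in lasso-type problems. The abstract does not say which algorithm the authors use, so this is only an illustration of how a non-smooth ℓ1 term can drive codes exactly to zero; the function names and sample values are hypothetical.

```python
def soft_threshold(codes, threshold):
    """Apply the l1 proximal operator: shrink each code towards zero.

    Entries with magnitude at or below the threshold become exactly zero,
    which is what produces sparsity.
    """
    shrunk = []
    for c in codes:
        if c > threshold:
            shrunk.append(c - threshold)
        elif c < -threshold:
            shrunk.append(c + threshold)
        else:
            shrunk.append(0.0)
    return shrunk


def l1_norm(codes):
    """Sum of absolute values of the codes."""
    return sum(abs(c) for c in codes)


# Hypothetical hidden-unit outputs for one input pattern.
codes = [1.5, -0.2, 0.3, -2.0]
sparse_codes = soft_threshold(codes, 0.5)
```

A smooth surrogate of the ℓ1 norm would only push the small entries close to zero, whereas this operator sets them to zero exactly.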

PRI-VAE: Principle-of-Relevant-Information Variational Autoencoders

2020

Although substantial efforts have been made to learn disentangled representations under the variational autoencoder (VAE) framework, the fundamental properties of the learning dynamics of most VAE models remain unknown and under-investigated. In this work, we first propose a novel learning objective, termed the principle-of-relevant-information variational autoencoder (PRI-VAE), to learn disentangled representations. We then present an information-theoretic perspective to analyze existing VAE models by inspecting the evolution of some critical information-theoretic quantities across training epochs. Our observations unveil some fundamental properties associated with VAEs. Empirical results also demonstrate the effectiveness of PRI-VAE on four benchmark data sets.