Learning an Overcomplete Dictionary using a Cauchy Mixture Model for Sparse Decay
Related papers
Efficient Highly Over-Complete Sparse Coding Using a Mixture Model
European Conference on Computer Vision, 2010
Sparse coding of sensory data has recently attracted notable attention in research on learning useful features from unlabeled data. Empirical studies show that mapping the data into a significantly higher-dimensional space with sparse coding can lead to superior classification performance. However, computationally it is challenging to learn a set of highly over-complete dictionary bases and to encode the
Estimation of sparse nonnegative sources from noisy overcomplete mixtures using MAP
Neural Computation, 2009
In this paper, a new algorithm for estimating sparse non-negative sources from a set of noisy linear mixtures is proposed. In particular, difficult situations with high noise levels and more sources than sensors (underdetermined case) are considered. It is shown that, when sources are very sparse in time and overlapped at some locations, they can be recovered even with very low SNR and by using much fewer sensors than sources. A theoretical analysis based on Bayesian estimation tools is included showing strong connections with algorithms in related areas of research such as ICA, NMF, FOCUSS, and sparse representation of data with overcomplete dictionaries. Our algorithm uses a Bayesian approach by modelling sparse signals through mixed-state random variables. This new model for priors imposes ℓ0-norm-based sparsity. We start our analysis for the case of non-overlapped sources (1-sparse), which allows us to simplify the search of the posterior maximum avoiding a combinatorial search. General algorithms for overlapped cases, such as 2-sparse and k-sparse sources, are derived by using the algorithm for 1-sparse signals recursively. Additionally, a combination of our MAP algorithm with the NN-KSVD algorithm is proposed for estimating the mixing matrix and the sources simultaneously in a real blind fashion. A complete set of simulation results is included showing the performance of our algorithm.
Statistical method for sparse coding of speech including a linear predictive model
Physica A-statistical Mechanics and Its Applications, 2006
Recently, different methods for obtaining sparse representations of a signal using dictionaries of waveforms have been studied. They are often motivated by the way the brain seems to process certain sensory signals. Algorithms have been developed using a specific criterion to choose the waveforms occurring in the representation. The waveforms are chosen from a fixed dictionary, and some algorithms also construct them as part of the method. In the case of speech signals, most approaches do not take into consideration the important temporal correlations that are exhibited. It is known that these correlations are well approximated by linear models. Incorporating this a priori knowledge of the signal can facilitate the search for a suitable representation and can also help with its interpretation. Lewicki proposed a method to solve the noisy and overcomplete independent component analysis problem. In the present paper we propose a modification of this statistical technique for obtaining a sparse representation using a generative parametric model. The representations obtained with the method proposed here and other techniques are applied to artificial data and real speech signals, and compared using different coding costs and sparsity measures. The results show that the proposed method achieves more efficient representations of these signals compared to the others. A qualitative analysis of these results is also presented, which suggests that the restriction imposed by the parametric model is helpful in discovering meaningful characteristics of the signals.
Learning sparsely used overcomplete dictionaries
We consider the problem of sparse coding, where each sample consists of a sparse linear combination of a set of dictionary atoms, and the task is to learn both the dictionary elements and the mixing coefficients. Alternating minimization is a popular heuristic for sparse coding, where the dictionary and the coefficients are estimated in alternate steps, keeping the other fixed. Typically, the coefficients are estimated via ℓ1 minimization, keeping the dictionary fixed, and the dictionary is estimated through least squares, keeping the coefficients fixed. In this paper, we establish local linear convergence for this variant of alternating minimization and establish that the basin of attraction for the global optimum (corresponding to the true dictionary and the coefficients) is O(1/s²), where s is the sparsity level in each sample and the dictionary satisfies RIP. Combined with the recent results of approximate dictionary estimation, this yields provable guarantees for exact recovery of both the dictionary elements and the coefficients, when the dictionary elements are incoherent.
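The alternating scheme described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm or its analysis setting: the ℓ1 step is solved here with plain ISTA, the dictionary columns are renormalized after each least-squares update (a common convention, assumed here), and all function names and the parameters `lam`, `n_iter`, and `n_outer` are hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t * |.|_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_code_ista(Y, A, lam, n_iter=100):
    """Approximately minimize 0.5 * ||Y - A X||_F^2 + lam * ||X||_1 over X (A fixed)."""
    # Lipschitz constant of the gradient of the quadratic term; guarded against zero.
    L = max(np.linalg.norm(A, 2) ** 2, 1e-12)
    X = np.zeros((A.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = A.T @ (A @ X - Y)
        X = soft_threshold(X - grad / L, lam / L)
    return X

def update_dictionary(Y, X):
    """Least-squares dictionary update (X fixed), followed by column normalization."""
    A = Y @ np.linalg.pinv(X)
    norms = np.linalg.norm(A, axis=0)
    norms[norms == 0] = 1.0  # leave unused (all-zero) atoms untouched
    return A / norms

def alternating_minimization(Y, A0, lam=0.05, n_outer=10):
    """Alternate between l1 sparse coding and least-squares dictionary updates."""
    A = A0 / np.linalg.norm(A0, axis=0)
    for _ in range(n_outer):
        X = sparse_code_ista(Y, A, lam)
        A = update_dictionary(Y, X)
    return A, sparse_code_ista(Y, A, lam)
```

As the abstract notes, guarantees for this kind of scheme are local, so in practice the result depends on the initial dictionary `A0`.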
Blind source separation and sparse component analysis of overcomplete mixtures
2004
We formulate conditions (k-SCA-conditions) under which we can represent a given (m × N)-matrix X (data set) uniquely (up to scaling and permutation) as a multiplication of m × n and n × N matrices A and S (often called mixing matrix or dictionary and source matrix, respectively), such that S is sparse of level n − m + k in the sense that each column of S has at least n − m + k zero elements. We call this the k-Sparse Component Analysis problem (k-SCA). Conditions on a matrix S are presented such that the k-SCA-conditions are satisfied for the matrix X = AS, where A is an arbitrary matrix from some class. This is the Blind Source Separation problem and the above conditions are called identifiability conditions.
Sparse Modeling with Applications to Speech Processing: A Survey
Indonesian Journal of Electrical Engineering and Computer Science, 2016
There has been growing interest in the study of sparse approximation of signals. Using an over-complete dictionary consisting of prototype signals or atoms, signals are described by sparse linear combinations of these atoms. Applications of sparse representation are many and include compression, source separation, enhancement, regularization in inverse problems, feature extraction, and more. This article presents a literature review of sparse coding applications in the field of speech processing.
Audio Source Separation using Sparse Representations
Principles, Algorithms and Systems
We address the problem of audio source separation, namely, the recovery of audio signals from recordings of mixtures of those signals. The sparse component analysis framework is a powerful method for achieving this. Sparse orthogonal transforms, in which only few transform coefficients differ significantly from zero, are developed; once the signal has been transformed, energy is apportioned from each transform coefficient to each estimated source, and, finally, the signal is reconstructed using the inverse transform. The overriding aim of this chapter is to demonstrate how this framework, as exemplified here by two different decomposition methods which adapt to the signal to represent it sparsely, can be used to solve different problems in different mixing scenarios. To address the instantaneous (neither delays nor echoes) and underdetermined (more sources than mixtures) mixing model, a lapped orthogonal transform is adapted to the signal by selecting a basis from a library of predetermined bases. This method is highly related to the windowing methods used in the MPEG audio coding framework. In considering the anechoic (delays but no echoes) and determined (equal number of sources and mixtures) mixing case, a greedy adaptive transform is used based on orthogonal basis functions that are learned from the observed data, instead of being selected from a predetermined library of bases. This is found to encode the signal characteristics, by introducing a feedback system between the bases and the observed data. Experiments on mixtures of speech and music signals demonstrate that these methods give good signal approximations and separation performance, and indicate promising directions for future research.
Random Models for Sparse Signals Expansion on Unions of Bases With Application to Audio Signals
IEEE Transactions on Signal Processing, 2000
A new approach for signal expansion with respect to hybrid dictionaries, based upon probabilistic modeling, is proposed and studied, with emphasis on audio signal processing applications. The signal is modeled as a sparse linear combination of waveforms, taken from the union of two orthonormal bases, with random coefficients. The behavior of the analysis coefficients, namely inner products of the signal with all basis functions, is studied in detail, which shows that these coefficients may generally be classified in two categories: significant coefficients versus insignificant coefficients. Conditions ensuring the feasibility of such a classification are given. When the classification is possible, it leads to efficient estimation algorithms that may in turn be used for de-noising or coding purposes. The proposed approach is illustrated by numerical experiments on audio signals, using MDCT bases.
Regularized low-coherence overcomplete dictionary learning for sparse signal decomposition
2016 24th European Signal Processing Conference (EUSIPCO), 2016
This paper deals with learning an overcomplete set of atoms that have low mutual coherence. To this aim, we propose a new dictionary learning (DL) problem that enables a control on the amounts of the decomposition error and the mutual coherence of the atoms of the dictionary. Unlike existing methods, our new problem directly incorporates the mutual coherence term into the usual DL problem as a regularizer. We also propose an efficient algorithm to solve the new problem. Our new algorithm uses block coordinate descent, and updates the dictionary atom-by-atom, leading to closed-form solutions. We demonstrate the superiority of our new method over existing approaches in learning low-coherence overcomplete dictionaries for natural image patches.
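For reference, the mutual coherence of a dictionary is usually defined as the largest absolute inner product between two distinct unit-norm atoms. The sketch below computes only that quantity, under the standard definition; it does not reproduce the paper's regularized problem or its block coordinate descent updates, and the function name `mutual_coherence` is a hypothetical choice.

```python
import numpy as np

def mutual_coherence(D):
    """Largest absolute inner product between distinct, normalized columns of D."""
    Dn = D / np.linalg.norm(D, axis=0, keepdims=True)  # assumes no all-zero atoms
    G = np.abs(Dn.T @ Dn)      # absolute Gram matrix of the normalized atoms
    np.fill_diagonal(G, 0.0)   # ignore each atom's inner product with itself
    return G.max()
```

An orthonormal basis has coherence 0, while an overcomplete dictionary necessarily has coherence above 0; lower values mean the atoms are less alike.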