Accelerated Variational Dirichlet Process Mixtures
Related papers
Variational inference for Dirichlet process mixtures
2006
Dirichlet process (DP) mixture models are the cornerstone of nonparametric Bayesian statistics, and the development of Markov chain Monte Carlo (MCMC) sampling methods for DP mixtures has enabled the application of nonparametric Bayesian methods to a variety of practical data analysis problems. However, MCMC sampling can be prohibitively slow, and it is important to explore alternatives.
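For context, a minimal sketch of the truncated stick-breaking coordinate-ascent scheme that this variational line of work builds on, here for a DP mixture of unit-variance one-dimensional Gaussians with a N(0, sigma0^2) prior on the means. The truncation level T, hyperparameters, and function name are illustrative assumptions, not taken from the paper:

```python
# Coordinate-ascent sketch of truncated stick-breaking variational inference
# for a DP mixture of unit-variance 1-D Gaussians. All names and constants
# (T, alpha, sigma0, n_iters) are illustrative, not the paper's.
import numpy as np
from scipy.special import digamma

def vi_dp_gmm(x, T=20, alpha=1.0, sigma0=10.0, n_iters=100):
    n = len(x)
    phi = np.full((n, T), 1.0 / T)                 # q(z_n): responsibilities
    for _ in range(n_iters):
        # q(v_t) = Beta(g1[t], g2[t]) -- stick-breaking proportions
        counts = phi.sum(axis=0)
        g1 = 1.0 + counts
        g2 = alpha + np.cumsum(counts[::-1])[::-1] - counts   # sum over j > t
        # q(mu_t) = Normal(m[t], s2[t]) with prior N(0, sigma0^2)
        prec = 1.0 / sigma0**2 + counts
        m = (phi * x[:, None]).sum(axis=0) / prec
        s2 = 1.0 / prec
        # Expected log stick weights under the Beta factors
        e_log_v = digamma(g1) - digamma(g1 + g2)
        e_log_1mv = digamma(g2) - digamma(g1 + g2)
        log_pi = e_log_v + np.concatenate(([0.0], np.cumsum(e_log_1mv)[:-1]))
        # Expected Gaussian log-likelihood, up to an additive constant
        log_lik = -0.5 * ((x[:, None] - m)**2 + s2)
        log_phi = log_pi + log_lik
        log_phi -= log_phi.max(axis=1, keepdims=True)   # numerical stability
        phi = np.exp(log_phi)
        phi /= phi.sum(axis=1, keepdims=True)
    return phi, m, s2
```

Each pass updates the Beta stick factors, the Gaussian mean factors, and the responsibilities in closed form, which is what makes such schemes fast relative to MCMC.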
Lecture Notes in Computer Science, 2012
In this paper, we introduce a nonparametric Bayesian approach for clustering based on both Dirichlet processes and the generalized Dirichlet (GD) distribution. Thanks to the proposed approach, the obstacle of estimating the correct number of clusters is sidestepped by assuming an infinite number of components. The problems of overfitting and underfitting the data are also prevented by the nature of the nonparametric Bayesian framework. The proposed model is learned through a variational method in which the whole inference process is analytically tractable, with closed-form solutions. The effectiveness and merits of the proposed clustering approach are investigated on two challenging real applications, namely anomaly intrusion detection and image spam filtering.
Clustering consistency with Dirichlet process mixtures
Biometrika
Dirichlet process mixtures are flexible nonparametric models, particularly suited to density estimation and probabilistic clustering. In this work we study the posterior distribution induced by Dirichlet process mixtures as the sample size increases, and more specifically focus on consistency for the unknown number of clusters when the observed data are generated from a finite mixture. Crucially, we consider the situation where a prior is placed on the concentration parameter of the underlying Dirichlet process. Previous findings in the literature suggest that Dirichlet process mixtures are typically not consistent for the number of clusters if the concentration parameter is held fixed and data come from a finite mixture. Here we show that consistency for the number of clusters can be achieved if the concentration parameter is adapted in a fully Bayesian way, as commonly done in practice. Our results are derived for data coming from a class of finite mixtures, with mild assumptions ...
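A rough intuition behind the fixed-concentration inconsistency results: a priori, a CRP(alpha) partition keeps spawning new tables as n grows, with the expected number of clusters growing roughly like alpha * log(n). A minimal simulation sketch of that prior behaviour (all names and constants are illustrative, and this illustrates the prior only, not the paper's posterior analysis):

```python
# Simulate the Chinese restaurant process prior: for fixed alpha, the
# expected number of clusters grows ~ alpha * log(n), so extra small
# clusters keep appearing as n grows. Illustrative sketch only.
import numpy as np

def crp_num_clusters(n, alpha, rng):
    counts = []                                   # customers per table
    for _ in range(n):
        p = np.array(counts + [alpha], dtype=float)
        k = rng.choice(len(p), p=p / p.sum())
        if k == len(counts):
            counts.append(1)                      # open a new table
        else:
            counts[k] += 1
    return len(counts)

rng = np.random.default_rng(0)
for n in (100, 1000, 10000):
    ks = [crp_num_clusters(n, alpha=1.0, rng=rng) for _ in range(20)]
    print(n, np.mean(ks), 1.0 * np.log(n))        # empirical vs alpha*log(n)
```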
Dirichlet Process Parsimonious Mixtures for clustering
arXiv (Cornell University), 2015
The parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have shown their success, in particular in cluster analysis. Their estimation is in general performed by maximum likelihood estimation and has also been considered from a parametric Bayesian perspective. We propose new Dirichlet Process Parsimonious Mixtures (DPPM), which represent a Bayesian nonparametric formulation of these parsimonious Gaussian mixture models. ...
Applied Intelligence, 2015
We developed a variational Bayesian learning framework for the infinite generalized Dirichlet mixture model (i.e., a weighted mixture of Dirichlet process priors based on the generalized inverted Dirichlet distribution) that has proven its capability to model complex multidimensional data. We also integrate a "feature selection" approach to highlight the features that are most informative, in order to construct an appropriate model in terms of clustering accuracy. Experiments on synthetic data, as well as real data from visual scenes and handwritten digit datasets, illustrate and validate the proposed approach.
Online Variational Learning of Dirichlet Process Mixtures of Scaled Dirichlet Distributions
Information Systems Frontiers, 2020
Data clustering is an unsupervised method that has attracted considerable attention, and a large class of tasks can be formulated with it. Mixture models, as a branch of clustering methods, have been used in various fields of research such as computer vision and pattern recognition. To apply these models, we need to address some problems, such as finding a distribution that properly fits the data, defining model complexity, and estimating the model parameters. In this paper, we apply the scaled Dirichlet distribution to tackle the first challenge and propose a novel online variational method to mitigate the other two issues simultaneously. The effectiveness of the proposed work is evaluated on four challenging real applications, namely text and image spam categorization and diabetes and hepatitis detection. Keywords: Infinite mixture models · Dirichlet process mixtures of scaled Dirichlet distributions · Online variational learning · Spam categorization · Diabetes · Hepatitis
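The online aspect can be sketched independently of the scaled Dirichlet specifics: stochastic variational methods of this kind process a minibatch, form the global parameter estimate the full-data update would give if the minibatch were replicated, and blend it in with a decaying step size. A generic sketch (the function, argument names, and learning-rate constants are illustrative; the paper's scaled-Dirichlet sufficient statistics are not reproduced here):

```python
# Generic online (stochastic) variational update pattern. One step blends
# the current global variational parameter with a noisy full-data estimate
# built from a minibatch, using a Robbins-Monro step size.
def online_vi_step(lam, prior, minibatch_stats, t, N, B,
                   tau0=64.0, kappa=0.7):
    """lam: current global variational parameter.
    prior: the corresponding prior parameter.
    minibatch_stats: sufficient statistics summed over the minibatch.
    t: update counter; N: dataset size; B: minibatch size.
    tau0, kappa: illustrative learning-rate constants."""
    rho = (t + tau0) ** (-kappa)          # decaying step size
    lam_hat = prior + (N / B) * minibatch_stats   # scaled-up minibatch estimate
    return (1.0 - rho) * lam + rho * lam_hat
```

With kappa in (0.5, 1], the step sizes satisfy the usual stochastic-approximation conditions, which is what lets such updates converge while seeing each data point only in passing.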
ClusterCluster: Parallel Markov Chain Monte Carlo for Dirichlet Process Mixtures
The Dirichlet process (DP) is a fundamental mathematical tool for Bayesian nonparametric modeling, and is widely used in tasks such as density estimation, natural language processing, and time series modeling. Although MCMC inference methods for the DP often provide a gold standard in terms of asymptotic accuracy, they can be computationally expensive and are not obviously parallelizable. We propose a reparameterization of the Dirichlet process that induces conditional independencies between the atoms that form the random measure. This conditional independence enables many of the Markov chain transition operators for DP inference to be simulated in parallel across multiple cores. Applied to mixture modeling, our approach enables the Dirichlet process to simultaneously learn clusters that describe the data and superclusters that define the granularity of parallelization. Unlike previous approaches, our technique does not require alteration of the model and leaves the true posterior distribution ...
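The parallelization pattern can be sketched schematically: each supercluster owns a disjoint set of clusters and their data, so within-supercluster sweeps touch disjoint state and can run on separate workers, with a serial step handling moves across superclusters. The sweep body below is a toy placeholder, not the paper's transition operators:

```python
# Schematic of the supercluster pattern: per-supercluster Gibbs sweeps run
# on separate processes because their state is disjoint. The sweep body is
# a placeholder; real operators would reassign points among this
# supercluster's clusters only.
from concurrent.futures import ProcessPoolExecutor

def sweep(supercluster):
    data, assignments = supercluster
    # ... within-supercluster reassignments would happen here ...
    return assignments

def parallel_round(superclusters):
    # superclusters: list of (data, assignments) pairs with disjoint state.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(sweep, superclusters))
    # A subsequent serial step moves clusters between superclusters so the
    # overall chain still targets the true posterior.
```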
Distributed Inference for Dirichlet Process Mixture Models
2015
Bayesian nonparametric mixture models based on the Dirichlet process (DP) have been widely used for solving problems like clustering, density estimation and topic modelling. These models make weak assumptions about the underlying process that generated the observed data, so when more data are collected, the complexity of these models can change accordingly. These theoretical properties often lead to superior predictive performance compared to traditional finite mixture models. However, despite the increasing amount of data available, the application of Bayesian nonparametric mixture models has so far been limited to relatively small data sets. In this paper, we propose an efficient distributed inference algorithm for DP and hierarchical Dirichlet process (HDP) mixture models. The proposed method is based on a variant of the slice sampler for DPs. Since this sampler does not involve a pre-determined truncation, the stationary distribution of the sampling algorithm is unbiased. We provide both local thread-level ...
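The key device is a Walker-style slice variable that makes the number of candidate clusters per point finite without a fixed truncation, which is why the stationary distribution remains exact. A minimal sketch of that step (names are illustrative; this shows the serial trick, not the paper's distributed variant):

```python
# Slice-sampling trick for DP mixtures: an auxiliary u_n ~ Uniform(0, pi[z_n])
# restricts each point to clusters whose stick weight exceeds its slice,
# so only finitely many sticks ever need to be instantiated.
import numpy as np

def slice_candidate_sets(pi, z, rng):
    """pi: currently instantiated stick weights; z: current assignments."""
    u = rng.uniform(0.0, pi[z])                  # one slice variable per point
    # Each point may only move to clusters with pi_k > u_n; new sticks are
    # broken until the leftover mass drops below min(u).
    candidates = [np.flatnonzero(pi > u_n) for u_n in u]
    return u, candidates
```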
A sequential algorithm for fast fitting of Dirichlet process mixture models
2013
In this article we propose an improvement on the sequential updating and greedy search (SUGS) algorithm for fast fitting of Dirichlet process mixture models. The SUGS algorithm provides a means of very fast approximate Bayesian inference for mixture data, which is particularly useful when data sets are so large that many standard Markov chain Monte Carlo (MCMC) algorithms cannot be applied efficiently or take a prohibitively long time to converge. In particular, these ideas are used to interrogate the data initially and to refine models so that exact data analysis can potentially be applied later. SUGS relies upon sequentially allocating data to clusters and proceeding with an update of the posterior on the subsequent allocations and parameters that assumes this allocation is correct. Our modification softens this approach by providing a probability distribution over allocations, at a similar computational cost; this approach has an interpretation as a variational Bayes procedure, and hence we term it variational SUGS (VSUGS). It is shown in simulated examples that VSUGS can outperform the original SUGS algorithm in terms of density estimation and classification in many scenarios.
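The SUGS/VSUGS distinction can be sketched in a few lines: both score each arriving point against the existing clusters and a potential new cluster using CRP-weighted predictive probabilities, but SUGS commits to the argmax while VSUGS retains the normalized weights as a soft allocation. A sketch under those assumptions (`pred_prob` and the cluster-statistics layout are hypothetical):

```python
# Soft vs. hard sequential allocation of one arriving point. `pred_prob`
# is a hypothetical predictive density: pred_prob(x, stats) for an existing
# cluster, pred_prob(x, None) for the prior predictive of a new cluster.
import numpy as np

def allocate(x_n, cluster_stats, alpha, pred_prob, soft=True):
    # CRP prior weights: existing-cluster counts, plus alpha for a new cluster
    counts = np.array([s["count"] for s in cluster_stats] + [alpha])
    liks = np.array([pred_prob(x_n, s) for s in cluster_stats]
                    + [pred_prob(x_n, None)])
    w = counts * liks
    w /= w.sum()
    if soft:
        return w                      # VSUGS: distribution over allocations
    hard = np.zeros_like(w)
    hard[w.argmax()] = 1.0            # SUGS: greedy hard allocation
    return hard
```

The soft weights then enter the posterior updates in place of a one-hot assignment, which is what gives the method its variational Bayes interpretation at essentially the same cost.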