Incremental learning of nonparametric Bayesian mixture models
Related papers
Incremental Learning of Nonparametric Bayesian Mixture Models: Extended Thesis Chapter
2011
Clustering is a fundamental task in many vision applications. To date, most clustering algorithms work in a batch setting, in which training examples must be gathered into a large group before learning can begin. Here we explore incremental clustering, in which data can arrive continuously. We present a novel incremental model-based clustering algorithm based on nonparametric Bayesian methods, which we call Memory Bounded Variational Dirichlet Process (MB-VDP).
International Joint Conference on Artificial Intelligence, 2013
In this paper, we develop a clustering approach based on variational incremental learning of a Dirichlet process of generalized Dirichlet (GD) distributions. Our approach is built on nonparametric Bayesian analysis, where the determination of the complexity of the mixture model (i.e. the number of components) is sidestepped by assuming an infinite number of mixture components. By leveraging an incremental variational inference algorithm, the model complexity and all of the model's parameters are estimated simultaneously and effectively in a single optimization framework. Moreover, thanks to its incremental nature and Bayesian roots, the proposed framework avoids over- and under-fitting problems and offers good generalization capabilities. The effectiveness of the proposed approach is tested on a challenging application involving the clustering of visual scenes.
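The idea of "assuming an infinite number of mixture components" can be illustrated with the stick-breaking representation of a Dirichlet process, which is a standard construction and not necessarily the exact formulation used in the paper above. The sketch below draws mixture weights under a finite truncation; the function name `stick_breaking_weights`, the concentration value `alpha=2.0`, and the truncation level of 20 are assumptions chosen for illustration.

```python
import random


def stick_breaking_weights(alpha, truncation, seed=0):
    """Draw mixture weights from a truncated stick-breaking construction.

    Each step breaks off a Beta(1, alpha) fraction of the remaining stick.
    The final component receives whatever is left, so the weights sum to 1.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if truncation < 1:
        raise ValueError("truncation must be at least 1")

    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick that has not been assigned yet
    for _ in range(truncation - 1):
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)  # last component takes the rest of the stick
    return weights


# Hypothetical settings, chosen only for illustration.
weights = stick_breaking_weights(alpha=2.0, truncation=20)
```

In this construction most of the mass tends to fall on the first few components, so the number of components that effectively carry weight is governed by the concentration parameter and the data rather than being fixed in advance.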
Online Variational Learning of Dirichlet Process Mixtures of Scaled Dirichlet Distributions
Information Systems Frontiers, 2020
Data clustering, as an unsupervised method, has attracted considerable attention, and a large class of tasks can be formulated with it. Mixture models, as a branch of clustering methods, have been used in various fields of research such as computer vision and pattern recognition. To apply these models, we need to address several problems: finding a distribution that properly fits the data, defining the model complexity, and estimating the model parameters. In this paper, we apply the scaled Dirichlet distribution to tackle the first challenge and propose a novel online variational method to address the other two issues simultaneously. The effectiveness of the proposed work is evaluated on four challenging real applications, namely text and image spam categorization and diabetes and hepatitis detection.
Keywords: Infinite mixture models • Dirichlet process mixtures of scaled Dirichlet distributions • Online variational learning • Spam categorization • Diabetes • Hepatitis
1 Introduction
Considerable growth in technologies results in the generation of various types of digital data such as text, images and video, which provides opportunities to extract valuable information and meaningful patterns. Thus, finding an efficient model to describe data has become an interesting
Narges Manouchehri
Discriminative Bayesian Nonparametric Clustering
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017
We propose a general framework for discriminative Bayesian nonparametric clustering to promote inter-cluster discrimination among the learned clusters in a fully Bayesian nonparametric (BNP) manner. Our method combines existing BNP clustering and discriminative models by enforcing latent cluster indices to be consistent with the predicted labels resulting from a probabilistic discriminative model. This formulation results in a well-defined generative process wherein we can use either logistic regression or SVM for discrimination. Using the proposed framework, we develop two novel discriminative BNP variants: the discriminative Dirichlet process mixtures, and the discriminative-state infinite HMMs for sequential data. We develop efficient data-augmentation Gibbs samplers for posterior inference. Extensive experiments in image clustering and dynamic location clustering demonstrate that by encouraging discrimination between induced clusters, our model enhances the quality of clustering in com...
Applied Intelligence, 2015
We developed a variational Bayesian learning framework for the infinite generalized Dirichlet mixture model (i.e. a weighted mixture of Dirichlet process priors based on the generalized inverted Dirichlet distribution) that has proven its capability to model complex multidimensional data. We also integrate a "feature selection" approach to highlight the features that are most informative in order to construct an appropriate model in terms of clustering accuracy. Experiments on synthetic data as well as real data generated from visual scenes and handwritten digits datasets illustrate and validate the proposed approach.
Neural Processing Letters, 2015
In this work, we develop a statistical framework for data clustering which uses hierarchical Dirichlet processes and Beta-Liouville distributions. The parameters of this framework are learned using two variational Bayes approaches. The first one considers batch settings and the second one takes into account the dynamic nature of real data. Experimental results based on a challenging problem, namely visual scene categorization, demonstrate the merits of the proposed framework.
Dirichlet Process Parsimonious Mixtures for clustering
arXiv (Cornell University), 2015
The parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have proven successful, in particular in cluster analysis. Their estimation is in general performed by maximum likelihood and has also been considered from a parametric Bayesian perspective. We propose new Dirichlet Process Parsimonious mixtures (DPPM) which represent a Bayesian nonparametric formulation of these parsimonious Gaussian mixture models. The
Pattern Analysis and Applications, 2019
Developing effective machine learning methods for multimedia data modeling continues to challenge computer vision scientists. The capability of providing effective learning models can have a significant impact on various applications. In this work, we propose a nonparametric Bayesian approach to address simultaneously two fundamental problems, namely clustering and feature selection. The approach is based on infinite generalized Dirichlet (GD) mixture models constructed through the framework of the Dirichlet process and learned using an accelerated variational algorithm that we have developed. Furthermore, we extend the proposed approach using another nonparametric Bayesian prior, namely the Pitman-Yor process, to construct the infinite generalized Dirichlet mixture model. Our experiments, which were conducted on synthetic data sets, on the clustering of real-world data sets, and on a challenging application, namely automatic human action recognition, indicate that the proposed framework provides good modeling and generalization capabilities.
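To make the difference between the two priors mentioned above concrete, the sketch below samples truncated stick-breaking weights for a Pitman-Yor process, where the k-th break is drawn from Beta(1 - d, alpha + k d). Setting the discount d to 0 recovers the Dirichlet process. This is the textbook construction, not the accelerated variational algorithm of the paper; the function name `pitman_yor_weights` and all parameter values are assumptions for illustration.

```python
import random


def pitman_yor_weights(alpha, discount, truncation, seed=0):
    """Draw truncated stick-breaking weights for a Pitman-Yor process.

    The k-th break (k = 1, 2, ...) is a Beta(1 - discount, alpha + k * discount)
    fraction of the remaining stick. With discount = 0 this reduces to the
    Dirichlet process construction with Beta(1, alpha) breaks.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if alpha <= -discount:
        raise ValueError("alpha must be greater than -discount")
    if truncation < 1:
        raise ValueError("truncation must be at least 1")

    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # unassigned part of the stick
    for k in range(1, truncation):
        fraction = rng.betavariate(1.0 - discount, alpha + k * discount)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)  # last component absorbs the remainder
    return weights


# Hypothetical settings: same concentration, with and without a discount.
dp_weights = pitman_yor_weights(alpha=1.0, discount=0.0, truncation=50)
py_weights = pitman_yor_weights(alpha=1.0, discount=0.5, truncation=50)
```

A positive discount gives the weights a heavier, power-law-like tail, which is the usual motivation for preferring the Pitman-Yor process when many small clusters are expected.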