Improved estimation of hidden Markov model parameters from multiple observation sequences

A modified Baum-Welch algorithm for hidden Markov models with multiple observation spaces

IEEE Transactions on Speech and Audio Processing, 2001

In this paper, we derive an algorithm similar to the well-known Baum-Welch algorithm for estimating the parameters of a hidden Markov model (HMM). The new algorithm allows the observation PDF of each state to be defined and estimated using a different feature set. We show that estimating parameters in this manner is equivalent to maximizing the likelihood function for the standard parameterization of the HMM defined on the input data space. The processor becomes optimal if the state-dependent feature sets are sufficient statistics to distinguish each state individually from a common state.
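As a rough illustration of the idea, the sketch below (in Python, with hypothetical per-state feature extractors `feature_fns` and Gaussian observation PDFs, neither of which is specified by the abstract) shows how the forward pass of a Baum-Welch-style algorithm changes when each state scores the raw input in its own feature space:

```python
import numpy as np

def state_likelihoods(x, feature_fns, means, covs):
    """Likelihood of raw input x under each state, computed in that
    state's own feature space (Gaussian PDFs assumed for illustration)."""
    like = np.empty(len(feature_fns))
    for j, fn in enumerate(feature_fns):
        z = fn(x)                                   # state-specific features
        d = z - means[j]
        inv = np.linalg.inv(covs[j])
        norm = np.sqrt((2 * np.pi) ** len(z) * np.linalg.det(covs[j]))
        like[j] = np.exp(-0.5 * d @ inv @ d) / norm
    return like

def forward(X, pi, A, feature_fns, means, covs):
    """Scaled forward pass in which B[t, j] is evaluated with state j's
    own feature set instead of one shared feature vector."""
    B = np.array([state_likelihoods(x, feature_fns, means, covs) for x in X])
    alpha = pi * B[0]
    alpha /= alpha.sum()
    alphas = [alpha]
    for t in range(1, len(X)):
        alpha = (alpha @ A) * B[t]                  # propagate and re-weight
        alpha /= alpha.sum()                        # scale for stability
        alphas.append(alpha)
    return np.array(alphas)
```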

On the structure of hidden Markov models

Pattern Recognition Letters, 2004

This paper attempts to overcome the local convergence problem of Expectation Maximization (EM) based training of the Hidden Markov Model (HMM) in speech recognition. We propose a hybrid algorithm, a Simulated Annealing Stochastic version of EM (SASEM), which combines Simulated Annealing with EM by inserting a stochastic step between the EM iterations. The stochastic step can prevent EM from converging to a local maximum and, exploiting the global convergence properties of SA, find improved estimates for the HMM. Experiments on the TIMIT speech corpus show that SASEM obtains higher recognition accuracies than standard EM.
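A minimal sketch of how such a hybrid might be structured, assuming hypothetical `em_step` and `log_likelihood` helpers standing in for the usual Baum-Welch machinery (the paper's actual formulation and annealing schedule are not reproduced here):

```python
import numpy as np

def sasem(params, em_step, log_likelihood, t0=1.0, cooling=0.95,
          iters=100, noise=0.01, rng=np.random.default_rng(0)):
    """Interleave a simulated-annealing perturbation with EM updates.
    em_step is assumed to renormalize stochastic parameters after the
    perturbation; params is a dict of named parameter arrays."""
    temp = t0
    ll = log_likelihood(params)
    for _ in range(iters):
        # Stochastic step: perturb the current estimate.
        candidate = {k: v + rng.normal(0, noise * temp, np.shape(v))
                     for k, v in params.items()}
        candidate = em_step(candidate)              # one EM refinement
        cand_ll = log_likelihood(candidate)
        # Metropolis acceptance lets the search escape local maxima.
        if cand_ll > ll or rng.random() < np.exp((cand_ll - ll) / temp):
            params, ll = candidate, cand_ll
        temp *= cooling                             # cool the schedule
    return params, ll
```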

Exploitation of unlabeled sequences in hidden Markov models

2003

This paper presents a method for effectively using unlabeled sequential data in the learning of hidden Markov models (HMMs). With the conventional approach, class labels for unlabeled data are assigned deterministically by HMMs learned from labeled data. Such labeling often becomes unreliable when the amount of labeled data is small. We propose an extended Baum-Welch (EBW) algorithm in which the labeling is undertaken probabilistically and iteratively so that the likelihoods of both labeled and unlabeled data are improved.
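The probabilistic labeling can be pictured as follows. This is a sketch under the assumption of one HMM per class with a standard weighted re-estimation routine; `seq_loglik` and `reestimate` are hypothetical stand-ins, not the paper's EBW equations:

```python
import numpy as np

def soft_labels(models, priors, unlabeled, seq_loglik):
    """Posterior P(class | sequence) for every unlabeled sequence."""
    W = []
    for seq in unlabeled:
        log_post = np.log(priors) + np.array(
            [seq_loglik(m, seq) for m in models])
        log_post -= log_post.max()                  # stabilize the softmax
        post = np.exp(log_post)
        W.append(post / post.sum())
    return np.array(W)

def ebw_iteration(models, priors, labeled, unlabeled, seq_loglik, reestimate):
    """One iteration: soft-label the unlabeled data, then re-estimate."""
    W = soft_labels(models, priors, unlabeled, seq_loglik)
    new_models = []
    for c, model in enumerate(models):
        # Labeled data of class c enter with weight 1; unlabeled data
        # enter with their posterior weight for class c.
        seqs = labeled[c] + list(unlabeled)
        weights = [1.0] * len(labeled[c]) + list(W[:, c])
        new_models.append(reestimate(model, seqs, weights))
    return new_models
```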

Training hidden Markov models with multiple observations-a combinatorial method

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000

Hidden Markov models (HMMs) are stochastic models capable of statistical learning and classification. They have been applied in speech recognition and handwriting recognition because of their great adaptability and versatility in handling sequential signals. On the other hand, as these models have a complex structure, and because the involved data sets usually contain uncertainty, it is difficult to analyze the multiple observation training problem without certain assumptions. For many years researchers have used Levinson's training equations in speech and handwriting applications, simply assuming that all observations are independent of each other. This paper presents a formal treatment of HMM multiple observation training without imposing that assumption. In this treatment, the multiple observation probability is expressed as a combination of individual observation probabilities without losing generality. This combinatorial method gives greater freedom in making different dependence-independence assumptions. By generalizing Baum's auxiliary function into this framework and building up an associated objective function using the Lagrange multiplier method, it is proved that the derived training equations guarantee the maximization of the objective function. Furthermore, we show that Levinson's training equations can be derived as a special case of this treatment.
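To make the combination concrete, the following sketch shows a weighted transition-matrix update over K sequences. The weights stand in for the combination coefficients implied above; setting them all to 1 recovers the Levinson-style independence case. The quantities `gammas` and `xis` are assumed to come from standard forward-backward passes on each sequence:

```python
import numpy as np

def reestimate_transitions(N, gammas, xis, weights):
    """Weighted transition update over K observation sequences.

    gammas[k][t, i]  = P(state i at time t | sequence k)
    xis[k][t, i, j]  = P(state i at t, state j at t+1 | sequence k)
    weights[k]       = combination weight for sequence k
                       (all ones ~ Levinson's independence case)
    """
    num = np.zeros((N, N))
    den = np.zeros(N)
    for k in range(len(gammas)):
        num += weights[k] * xis[k].sum(axis=0)
        den += weights[k] * gammas[k][:-1].sum(axis=0)
    return num / den[:, None]                       # row-stochastic update
```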

Hidden Markov Models for Pattern Recognition

Markov Model - Theory and Applications

Hidden Markov Models (HMMs) are among the most popular algorithms for pattern recognition. They are mathematical representations of a stochastic process that produces a series of observations based on previously stored data. The statistical approach of HMMs has many benefits, including a robust mathematical foundation, powerful learning and decoding techniques, effective handling of sequences, and flexible topology for syntax and statistical phonology. The drawbacks stem from the poor model discrimination and the unrealistic assumptions required to build HMM theory, specifically the independence of successive feature frames (i.e., input vectors) and the first-order Markov assumption. The algorithms developed in the HMM-based statistical framework are robust and effective in real-time scenarios, and HMMs are frequently used in real-world applications such as gesture recognition and comprehension systems.
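For readers new to the formalism, a minimal discrete-observation forward algorithm (a textbook construction, not tied to this chapter's notation) makes the basic likelihood computation concrete:

```python
import numpy as np

def forward_prob(obs, pi, A, B):
    """P(observation sequence | model) for a discrete-output HMM with
    initial probabilities pi, transitions A, and emissions B."""
    alpha = pi * B[:, obs[0]]            # initialize with first symbol
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]    # propagate and re-weight
    return alpha.sum()

# Toy example: two states, two output symbols.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
print(forward_prob([0, 1, 0], pi, A, B))
```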

Fast state discovery for HMM model selection and learning

Proc. AISTATS, 2007

Choosing the number of hidden states and their topology (model selection) and estimating model parameters (learning) are important problems for Hidden Markov Models. This paper presents a new state-splitting algorithm that addresses both these problems. The algorithm models more information about the dynamic context of a state during a split, enabling it to discover underlying states more effectively. Compared to previous top-down methods, the algorithm also touches a smaller fraction of the data per split, leading to faster model search and selection. Because of its efficiency and ability to avoid local minima, the state-splitting approach is a good way to learn HMMs even if the desired number of states is known beforehand. We compare our approach to previous work on synthetic data as well as several real-world data sets from the literature, revealing significant improvements in efficiency and test-set likelihoods. We also compare to previous algorithms on a sign-language recognition task, with positive results.
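The mechanics of a single top-down split can be sketched as follows. The split-selection criterion, which is where the paper's contribution lies, is deliberately left abstract here, and the perturbation scheme is an illustrative assumption:

```python
import numpy as np

def split_state(pi, A, means, s, eps=0.01, rng=np.random.default_rng(0)):
    """Duplicate state s, halve its incoming mass between the two copies,
    and perturb the new emission mean so EM can pull the halves apart."""
    # Duplicate row s, then duplicate column s as a new last column.
    A2 = np.vstack([A, A[s]])
    A2 = np.hstack([A2, A2[:, [s]]])
    A2[:, s] /= 2.0
    A2[:, -1] = A2[:, s]
    A2 /= A2.sum(axis=1, keepdims=True)             # keep rows stochastic
    # Split the initial probability the same way.
    pi2 = np.append(pi, pi[s] / 2.0)
    pi2[s] /= 2.0
    # Perturb the duplicated emission mean so the copies can diverge.
    means2 = np.vstack(
        [means, means[s] + eps * rng.normal(size=means[s].shape)])
    return pi2, A2, means2
```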

Improved classification using hidden Markov averaging from multiple observation sequences

2002

The enormous popularity of Hidden Markov models (HMMs) in spatio-temporal pattern recognition is largely due to the ability to "learn" model parameters from observation sequences through the Baum-Welch and other re-estimation procedures. In this study, HMM parameters are estimated from an ensemble of models trained on individual observation sequences. The proposed methods are shown to provide superior classification performance to competing methods.
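A bare-bones sketch of the averaging step, assuming the per-sequence models share a topology and a consistent state ordering (in practice, states may need alignment before averaging; the paper's exact procedure is not reproduced here):

```python
import numpy as np

def average_hmms(models):
    """Average an ensemble of HMMs, one trained per observation sequence.
    models: list of (pi, A, B) tuples with identical shapes."""
    pis, As, Bs = zip(*models)
    pi = np.mean(pis, axis=0)
    A = np.mean(As, axis=0)
    B = np.mean(Bs, axis=0)
    # Renormalize so the averaged parameters stay stochastic.
    return (pi / pi.sum(),
            A / A.sum(axis=1, keepdims=True),
            B / B.sum(axis=1, keepdims=True))
```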

On the Initialization of Parameters of Hidden Markov Models

2021

Hidden Markov Models (HMMs) have been successfully applied to problems from different areas in recent years. Independent of the application, it is well known that an important step in the use of HMMs is the initialization of the model parameters. This initialization should take into account knowledge about the addressed problem as well as optimization techniques to estimate the best initial parameters given a cost function. Distinct techniques exist to initialize and evaluate HMMs; however, there is no common consensus concerning the choice of these tools. In this paper we illustrate, through examples, the effects of an inadequate initialization of HMM parameters, and also discuss important issues in the selection or estimation of these parameters. The presented results and examples highlight the relevance of studying the initial model parameters prior to signal modeling.
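As one concrete example of knowledge-driven initialization (a common strategy, not necessarily the one studied in the paper), the state emission means can be seeded by clustering the training frames instead of starting from arbitrary values:

```python
import numpy as np

def kmeans_init(frames, n_states, iters=10, rng=np.random.default_rng(0)):
    """Seed per-state emission means with a few steps of k-means."""
    centers = frames[rng.choice(len(frames), n_states, replace=False)]
    for _ in range(iters):
        # Assign every frame to its nearest center.
        d = np.linalg.norm(frames[:, None] - centers[None], axis=2)
        assign = d.argmin(axis=1)
        for j in range(n_states):
            if np.any(assign == j):
                centers[j] = frames[assign == j].mean(axis=0)
    return centers                       # initial per-state emission means

frames = np.random.default_rng(1).normal(size=(200, 3))
print(kmeans_init(frames, n_states=4).shape)        # (4, 3)
```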

A reevaluation and benchmark of hidden Markov models

Hidden Markov models are frequently used in handwriting-recognition applications. While a large number of methodological variants have been developed to accommodate different use cases, the core concepts have not changed much. In this paper, we develop a number of datasets to benchmark our own implementation as well as various other toolkits. We introduce a gradual scale of difficulty that allows comparison of datasets in terms of separability of classes. Two experiments are performed to review the basic HMM functions, especially aimed at evaluating the role of the transition probability matrix. We found that the transition matrix may be far less important than the observation probabilities. Furthermore, the traditional training methods are not always able to find the proper (true) topology of the transition matrix. These findings support the view that the quality of the features may require more attention than the aspect of temporal modelling addressed by HMMs.
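One way to probe this finding in one's own setup is a simple ablation: score test sequences with the trained transition matrix, then again with a uniform one, and compare the log-likelihood gap. The sketch assumes a standard scaled-forward scorer `forward_loglik` (a hypothetical helper, not from the paper):

```python
import numpy as np

def transition_ablation(pi, A, B, test_seqs, forward_loglik):
    """Mean test log-likelihood with the trained vs. a uniform
    transition matrix; a small gap suggests the observation
    probabilities carry most of the model's discriminative power."""
    uniform_A = np.full_like(A, 1.0 / A.shape[0])
    trained = [forward_loglik(s, pi, A, B) for s in test_seqs]
    ablated = [forward_loglik(s, pi, uniform_A, B) for s in test_seqs]
    return np.mean(trained), np.mean(ablated)
```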

Parameter estimation of multi-dimensional hidden Markov models - a scalable approach

2005

Parameter estimation is a key computational issue in all statistical image modeling techniques. In this paper, we explore a computationally efficient parameter estimation algorithm for multi-dimensional hidden Markov models. The 2-D HMM has been applied to supervised aerial image classification, and comparisons have been made with the originally proposed estimation algorithm. An extensive parametric study has been performed with the 3-D HMM, and the scalability of the estimation algorithm is discussed. The results show the broad applicability of the explored algorithm to multi-dimensional HMM-based image modeling applications.