Understanding HMM training for video gesture recognition

Model Structure Selection & Training Algorithms for a HMM Gesture Recognition System

Hidden Markov models (HMMs) using the Fully-Connected, Left-Right and Left-Right Banded model structures are applied to the problem of alphabetical letter gesture recognition. We examine the effect of training techniques, in particular the Baum-Welch and Viterbi Path Counting techniques, on each of the model structures. We show that recognition rates improve when moving from a Fully-Connected model to a Left-Right model and then to a Left-Right Banded 'staircase' model, with peak recognition rates of 84.8%, 92.31% and 97.31% respectively. The Left-Right Banded model in conjunction with Viterbi Path Counting gives the best performance. Direct calculation of model parameters from analysis of the physical system was also tested. This yielded a peak recognition rate of 92%, below the best trained model, but the simplicity and efficiency of the approach are of interest.
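The three model structures differ only in which state transitions are allowed. The sketch below illustrates this under the usual textbook definitions, which are an assumption here because the abstract does not spell them out: Fully-Connected allows every transition, Left-Right allows a state to stay or move to any later state, and Left-Right Banded allows a state only to stay or step to the next state. The function names `transition_mask` and `uniform_transitions` are hypothetical.

```python
def transition_mask(n_states, structure):
    """Return a 0/1 matrix marking which transitions i -> j are allowed."""
    mask = [[0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        for j in range(n_states):
            if structure == "fully_connected":
                allowed = True                # any state to any state
            elif structure == "left_right":
                allowed = j >= i              # stay, or jump to a later state
            elif structure == "left_right_banded":
                allowed = j in (i, i + 1)     # stay, or step to the next state
            else:
                raise ValueError(f"unknown structure: {structure}")
            mask[i][j] = 1 if allowed else 0
    return mask


def uniform_transitions(mask):
    """Spread probability evenly over the allowed transitions of each row."""
    return [[cell / sum(row) for cell in row] for row in mask]


if __name__ == "__main__":
    # Banded 'staircase' structure for a 3-state model.
    for row in uniform_transitions(transition_mask(3, "left_right_banded")):
        print(row)
    # [0.5, 0.5, 0.0]
    # [0.0, 0.5, 0.5]
    # [0.0, 0.0, 1.0]
```

A uniform matrix like this is only one possible starting point for training. Standard Baum-Welch re-estimation never turns a zero transition probability into a non-zero one, so a structure fixed at initialization persists through training.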

Model structure selection & training algorithms for an HMM gesture recognition system

Frontiers in Handwriting …, 2004

Model Structure Selection and Training Algorithms for an HMM Gesture Recognition System

Ninth International Workshop on Frontiers in Handwriting Recognition, 2004

Evaluation of HMM training algorithms for letter hand gesture recognition

2003

The paper introduces an application that uses computer vision for letter hand gesture recognition. A digital camera records a video stream of hand gestures. The hand is automatically segmented, the position of the hand centroid is calculated in each frame, and a trajectory of the hand is determined. After smoothing the trajectory, a sequence of angles of motion along the trajectory is calculated and quantized to form a discrete observation sequence. Hidden Markov models (HMMs) are used to recognize the letters. The Baum-Welch and Viterbi path counting algorithms are applied for training the HMMs. Our system recognizes all 26 letters from A to Z, and the database contains 30 example videos of each letter gesture. We achieve an average recognition rate of about 90 percent. A motivation for developing this system is to provide an alternative text input mechanism for camera-enabled handheld devices, such as video mobile phones and PDAs.
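The step from centroid trajectory to discrete observation sequence can be sketched as follows. This is a minimal illustration, not the paper's implementation: the moving-average smoother, the window size and the number of angle bins (`n_bins=16`) are all assumptions, since the abstract gives none of these details, and the function names are hypothetical.

```python
import math


def smooth(points, window=3):
    """Moving-average smoothing of a list of (x, y) centroid positions."""
    half = window // 2
    out = []
    for i in range(len(points)):
        lo, hi = max(0, i - half), min(len(points), i + half + 1)
        xs = [p[0] for p in points[lo:hi]]
        ys = [p[1] for p in points[lo:hi]]
        out.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return out


def quantize_angles(points, n_bins=16):
    """Map each step between consecutive points to a direction bin index.

    Bin 0 is centred on angle 0 (motion along +x); bins advance with
    increasing atan2 angle. In image coordinates the y axis usually points
    down, so the visual sense of rotation is flipped.
    """
    width = 2 * math.pi / n_bins
    symbols = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (x0, y0) == (x1, y1):
            continue  # no motion between frames, so no direction to encode
        angle = math.atan2(y1 - y0, x1 - x0) % (2 * math.pi)
        symbols.append(int((angle + width / 2) // width) % n_bins)
    return symbols


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
    print(quantize_angles(square, n_bins=4))  # [0, 1, 2, 3]
```

In a full pipeline the output of `smooth` would be passed to `quantize_angles`, and the resulting symbol list would serve as the observation sequence for a discrete HMM.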

Supervised training of conversive hidden non-markovian models: increasing usability for gesture recognition

2012

Hidden non-Markovian Models (HnMMs) were introduced and formalized as an extension of Hidden Markov Models (HMMs) for the analysis of partially observable stochastic processes. Their main advantage over HMMs is the ability to model arbitrary distributions for state transition durations, so that the unobservable stochastic process need not be Markovian. Besides academic examples, HnMMs have been applied to gesture recognition, where they performed well in distinguishing similar gestures with different execution speeds. While the Proxel method enabled the evaluation of arbitrary HnMMs, there was no way to train these models, so the models for different gestures had to be parameterized manually. This dramatically reduced their applicability to real gesture recognition. This paper presents a solution to this problem, introducing a supervised training approach that increases the applicability of HnMMs in gesture recognition.

Hand Gesture Recognition Using Hidden Markov Models

Hand gesture recognition from visual images has a number of potential applications in HCI (human-computer interaction), machine vision, VR (virtual reality), machine control in industry, and so on. Most conventional approaches to hand gesture recognition have employed data gloves. For a more natural interface, however, hand gestures must be recognized from visual images, as in communication between humans, without using any external devices. Our research is intended to draw and edit graphic elements by hand gesture. Up to now, many methods for hand gesture recognition have been proposed, such as syntactical analysis, neural-based approaches, and HMM (hidden Markov model) based recognition. As a gesture is a continuous motion over a sequential time series, the HMM should be a prominent recognition tool. Though each analysis method has merits and demerits, the most important question in hand gesture recognition is which input features best represent the characteristics of a moving hand gesture. In our research, we consider planar hand gestures in front of a camera and therefore use 8-directional chain codes as input vectors. For training an HMM network, a simple context modeling method is embedded as training on a "left-to-right" HMM model. This model is applied to draw graphic elements such as triangle, rectangle, circle, arc, horizontal line and vertical line, and to edit the specified graphic elements with commands such as copy, delete, move, swap, undo and close. The overall target set is therefore 12 dynamic gestures. In our experiments, we obtain good recognition results in a pre-confined test environment: 1) the spotting time is synchronized at the static state of a hand, 2) body parts other than the hands are motionless, 3) the change of hand posture during movement is meaningless. Our system will be advanced by adopting more diverse input features that better represent the dynamic features of hand gestures. 0-7803-4053-1/97/$10.00 © 1997 IEEE
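To make the recognition step concrete, the sketch below scores a sequence of 8-directional chain codes against several discrete left-to-right HMMs with the scaled forward algorithm and picks the most likely model. It is an illustration only: the two toy models, their probabilities, the convention that code 0 means motion along +x and code 2 motion along +y, and the names `log_likelihood`, `classify` and `peaked` are all assumptions, not details from the paper.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    obs must be a non-empty list of symbol indices.
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_prob += math.log(scale)
    return log_prob


def classify(obs, models):
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


def peaked(code, n_codes=8, p=0.86):
    """Emission row that favours one chain code (toy values)."""
    row = [(1 - p) / (n_codes - 1)] * n_codes
    row[code] = p
    return row


if __name__ == "__main__":
    start = [1.0, 0.0]                # left-to-right: always begin in state 0
    trans = [[0.7, 0.3], [0.0, 1.0]]  # no transitions back to earlier states
    models = {
        "horizontal_line": (start, trans, [peaked(0), peaked(0)]),
        "vertical_line": (start, trans, [peaked(2), peaked(2)]),
    }
    print(classify([0, 0, 1, 0, 0], models))  # horizontal_line
```

Working with per-step scaling and log-probabilities keeps the computation stable for long gesture sequences, where the raw forward probabilities would underflow.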

Real-Time American Sign Language Recognition from Video Using Hidden Markov Models

Computational Imaging and Vision, 1997

Hidden Markov models (HMMs) have been used prominently and successfully in speech recognition and, more recently, in handwriting recognition. Consequently, they seem ideal for visual recognition of complex, structured hand gestures such as those found in sign language. We describe two experiments that demonstrate a real-time HMM-based system for recognizing sentence-level American Sign Language (ASL) without explicitly modeling the fingers. The first experiment tracks hands wearing colored gloves and attains a word accuracy of 99%. The second experiment tracks hands without gloves and attains a word accuracy of 92%. Both experiments have a 40-word lexicon.

Visual sign language recognition based on HMMs and auto-regressive HMMs

Gesture in Human- …, 2006

This paper proposes a sign language recognition system based on Hidden Markov Models (HMMs) and Auto-regressive Hidden Markov Models (ARHMMs). ARHMMs take the relationship between successive observations fully into account and help to discriminate signs that have no obvious state transitions but are similar in motion trajectory. An ARHMM that models the observations with a mixture of conditional linear Gaussians is proposed for sign language recognition, and the corresponding training and recognition algorithms are developed. A hybrid structure that combines ARHMMs with HMMs, based on the trick of using an ambiguous word set, is presented, and the advantages of both models are demonstrated within this framework.

Deciding of HMM parameters based on number of critical points for gesture recognition from motion capture data

This paper presents a method for choosing the number of states of an HMM based on the number of critical points in the motion capture data. The choice of hidden Markov model (HMM) parameters is crucial for the recognizer's performance, as it is the first step of training and cannot be corrected automatically within the HMM. In this article we define a predictor of the number of states based on the number of critical points of the sequence and test its effectiveness against sample data.
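The abstract does not define "critical point" or the predictor that maps the count to a number of states. As a rough illustration of the counting step only, the sketch below assumes a critical point is an interior local extremum of a one-dimensional signal, meaning a sample where the slope changes sign. The function name `count_critical_points` and the `tol` parameter are hypothetical.

```python
def count_critical_points(signal, tol=0.0):
    """Count interior local extrema: places where the slope changes sign.

    Differences with magnitude <= tol are treated as flat, so a plateau
    between a rise and a fall counts as a single critical point.
    """
    count = 0
    prev_sign = 0
    for a, b in zip(signal, signal[1:]):
        diff = b - a
        sign = 1 if diff > tol else -1 if diff < -tol else 0
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                count += 1
            prev_sign = sign
    return count


if __name__ == "__main__":
    # Rises, falls, then rises again: one maximum and one minimum.
    print(count_critical_points([0, 1, 2, 1, 0, 1]))  # 2
```

How such a count would translate into a number of HMM states is left open here, since that mapping is the contribution of the paper itself.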