Combination of acoustic models in continuous speech recognition hybrid systems

A General Method for Combining Acoustic Features in an Automatic Speech Recognition System

2006

A general method for using different types of features in Automatic Speech Recognition (ASR) systems is presented. A Gaussian mixture model (GMM) is obtained in a reference acoustic space. A specific feature combination or selection is associated with each Gaussian of the mixture and used for computing symbol posterior probabilities. Symbols can refer to phonemes, phonemes in context, or states of a Hidden Markov Model (HMM). Experimental results are presented for applications to phoneme and word rescoring after verification. Two corpora were used: one with small vocabularies in Italian and Spanish, and one with a very large vocabulary in French.
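
To make the per-Gaussian feature-combination idea concrete, here is a minimal Python sketch: a GMM is fitted in a reference feature space, each Gaussian is paired with its own feature combination and classifier, and symbol posteriors are responsibility-weighted averages over the Gaussians. The toy data, the particular combinations, and the logistic-regression "experts" are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_ref = rng.normal(size=(2000, 13))       # reference-space features (e.g. MFCCs)
X_alt = rng.normal(size=(2000, 13))       # a second feature stream
y = rng.integers(0, 5, size=2000)         # symbol labels (phonemes / HMM states)

# one (illustrative) feature combination per Gaussian of the mixture
combos = [lambda a, b: a,
          lambda a, b: b,
          lambda a, b: np.hstack([a, b]),
          lambda a, b: a + b]
gmm = GaussianMixture(n_components=len(combos), random_state=0).fit(X_ref)
experts = [LogisticRegression(max_iter=1000).fit(c(X_ref, X_alt), y) for c in combos]

def symbol_posteriors(a, b):
    """P(symbol | x) = sum_g P(g | x_ref) * P(symbol | combo_g(x))."""
    resp = gmm.predict_proba(a)            # Gaussian responsibilities, (N, n_components)
    return sum(resp[:, [g]] * experts[g].predict_proba(combos[g](a, b))
               for g in range(len(combos)))

probs = symbol_posteriors(X_ref[:3], X_alt[:3])   # (3, 5) symbol posteriors
```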

Comparison and combination of features in a hybrid HMM/MLP and a HMM/GMM speech recognition system

IEEE Transactions on Speech and Audio Processing, 2005

Recently, the advantages of the spectral parameters obtained by frequency filtering (FF) of the logarithmic filter-bank energies (logFBEs) have been reported. These parameters, which are frequency derivatives of the logFBEs, lie in the frequency domain and have shown good recognition performance with respect to the conventional MFCCs for HMM systems. In this paper, the FF features are first compared with the MFCCs and the Rasta-PLP features using both a hybrid HMM/MLP and a conventional HMM/GMM recognition system, for both clean and noisy speech. Taking advantage of the ability of the hybrid system to deal with correlated features, the inclusion of both the frequency second-derivatives and the raw logFBEs as additional features is proposed and tested. Moreover, the robustness of these features in noisy conditions is enhanced by combining the FF technique with the Rasta temporal filtering approach. Finally, a study of the FF features in the framework of multi-stream processing is presented. The best recognition results for both clean and noisy speech are obtained from the multi-stream combination of the J-Rasta-PLP features and the FF features.
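
As a rough illustration of frequency filtering, the sketch below applies the commonly cited first-order filter H(z) = z - z^-1 along the frequency axis of a logFBE matrix; the band count, the edge handling, and the synthetic data are assumptions for illustration.

```python
import numpy as np

def frequency_filter(logfbe):
    """Apply H(z) = z - z^-1 along the frequency axis.

    logfbe: (n_frames, n_bands) log filter-bank energies.
    Returns FF features of the same shape (edges handled by replication).
    """
    padded = np.pad(logfbe, ((0, 0), (1, 1)), mode="edge")
    # output[k] = logfbe[k+1] - logfbe[k-1], a slope across adjacent bands
    return padded[:, 2:] - padded[:, :-2]

logfbe = np.log(np.random.rand(100, 20) + 1e-6)   # 100 frames, 20 bands
ff = frequency_filter(logfbe)                     # frequency first-derivatives
second_deriv = frequency_filter(ff)               # frequency second-derivatives
```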

Combining Neural Networks And Hidden Markov Models For Continuous Speech Recognition

1992

We present a speaker-independent, continuous-speech recognition system based on a hybrid multilayer perceptron (MLP)/hidden Markov model (HMM). The system combines the advantages of both approaches by using MLPs to estimate the state-dependent observation probabilities of an HMM. New MLP architectures and training procedures are presented that allow the modeling of multiple distributions for phonetic classes and context-dependent phonetic classes. Comparisons with a pure HMM system...
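
In hybrid systems of this kind, the MLP's state posteriors are typically converted into scaled likelihoods for decoding by dividing by the state priors (Bayes' rule); the sketch below shows that conversion, with shapes and the log-domain handling chosen for illustration.

```python
import numpy as np

def scaled_log_likelihoods(mlp_log_posteriors, log_state_priors):
    """log p(x|q) up to a constant: log P(q|x) - log P(q), per Bayes' rule."""
    return mlp_log_posteriors - log_state_priors[None, :]

# toy usage: 3 frames, 4 HMM states
log_post = np.log(np.full((3, 4), 0.25))                  # MLP outputs P(q|x)
log_priors = np.log(np.array([0.4, 0.3, 0.2, 0.1]))       # state priors from alignments
emissions = scaled_log_likelihoods(log_post, log_priors)  # fed to the HMM decoder
```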

Hybrid Neural Network/hidden Markov Model Continuous-Speech Recognition

1992

In this paper we present a hybrid multilayer perceptron (MLP)/hidden Markov model (HMM) speaker-independent continuous-speech recognition system, in which the advantages of both approaches are combined by using MLPs to estimate the state-dependent observation probabilities of an HMM. New MLP architectures and training procedures are presented which allow the modeling of multiple distributions for phonetic classes and context-dependent phonetic classes. Comparisons with a pure HMM system...

Mixture of Support Vector Machines for HMM based Speech Recognition

18th International Conference on Pattern Recognition (ICPR'06), 2006

State-of-the-art speech recognizers combine Hidden Markov Models (HMMs), which represent the temporal dynamics of speech very efficiently, and Gaussian mixture models, which perform the classification of speech into single speech units (phonemes) sub-optimally. In this paper we use parallel mixtures of Support Vector Machines (SVMs) for classification by integrating this method into an HMM-based speech recognition system. SVMs are very appealing due to their association with statistical learning theory and have already shown good results in pattern recognition and in continuous speech recognition. They suffer, however, from the training effort, which scales at least quadratically with the number of training vectors. The SVM mixtures need only nearly linear training time, making it easier to deal with the large amount of speech data. In our hybrid system we use the SVM mixtures as acoustic models in an HMM-based decoder. We train and test the hybrid system on the DARPA Resource Management (RM1) corpus, showing better performance than an HMM-based decoder using Gaussian mixtures.
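
A sketch of one standard divide-and-conquer recipe for SVM mixtures: cluster the acoustic vectors, train one SVM per cluster (near-linear total cost, since each SVM only sees its subset), and gate test frames to their cluster's expert. The cluster count, gating choice, and toy data are assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 39))              # e.g. MFCC + delta frames
y = rng.integers(0, 10, size=3000)           # phoneme-state labels

gate = KMeans(n_clusters=4, random_state=0, n_init=10).fit(X)
experts = [SVC().fit(X[gate.labels_ == k], y[gate.labels_ == k])
           for k in range(4)]                # each SVM trains on its own subset

def predict(X_test):
    clusters = gate.predict(X_test)          # route each frame to its expert
    return np.array([experts[c].predict(x.reshape(1, -1))[0]
                     for c, x in zip(clusters, X_test)])

labels = predict(X[:5])
```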

Investigations on features for log-linear acoustic models in continuous speech recognition

2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

Hidden Markov Models with Gaussian Mixture Models as emission probabilities (GHMMs) are the underlying structure of all state-of-the-art speech recognition systems. Using Gaussian mixture distributions follows the generative approach, where the class-conditional probability is modeled, although for classification only the posterior probability is needed. Although very successful in related fields such as Natural Language Processing (NLP), direct modeling of posterior probabilities with log-linear models has rarely been used in speech recognition and has not been applied successfully to continuous speech recognition. In this paper we report competitive results for a speech recognizer with a log-linear acoustic model on the Wall Street Journal corpus, a Large Vocabulary Continuous Speech Recognition (LVCSR) task. We trained this model from scratch, i.e. without relying on an existing GHMM system. Previously, the use of data-dependent sparse features for log-linear models has been proposed. We compare them with polynomial features and show that the combination of polynomial and data-dependent sparse features leads to better results.
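
A minimal sketch of a log-linear acoustic model in softmax form, P(s | x) = exp(lambda_s . f(x)) / Z(x), here with second-order polynomial features f(x); the feature choice and dimensions are illustrative, not the paper's exact setup.

```python
import numpy as np

def polynomial_features(x):
    """f(x) = [1, x, upper triangle of x x^T] for one frame."""
    quad = np.outer(x, x)[np.triu_indices(x.size)]
    return np.concatenate(([1.0], x, quad))

def log_linear_posteriors(x, lambdas):
    """Softmax over states; lambdas: (n_states, n_features) weights."""
    scores = lambdas @ polynomial_features(x)
    scores -= scores.max()                      # numerical stability
    p = np.exp(scores)
    return p / p.sum()

x = np.random.randn(13)                         # one MFCC frame
n_feat = 1 + 13 + 13 * 14 // 2                  # bias + linear + quadratic terms
lambdas = np.random.randn(6, n_feat) * 0.01     # 6 states, untrained toy weights
posteriors = log_linear_posteriors(x, lambdas)
```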

The efficient incorporation of MLP features into automatic speech recognition systems

Computer Speech & Language, 2011

In recent years, the use of Multi-Layer Perceptron (MLP) derived acoustic features has become increasingly popular in automatic speech recognition systems. These features are typically used in combination with standard short-term spectral-based features, and have been found to yield consistent performance improvements. However, there are a number of design decisions and issues associated with the use of MLP features for state-of-the-art speech recognition systems. Two modifications to the standard training/adaptation procedures are described in this work. First, the paper examines how MLP features, and the associated acoustic models, can be trained efficiently on large training corpora using discriminative training techniques. An approach that combines multiple individual MLPs is proposed, and this reduces the time needed to train MLPs on large amounts of data. In addition, to further speed up discriminative training, a lattice re-use method is proposed. The paper also examines how systems with MLP features can be adapted to particular speakers or acoustic environments. In contrast to previous work (where standard HMM adaptation schemes are used), linear input network adaptation is investigated. System performance is investigated within a multi-pass adaptation/combination framework. This allows the performance gains of individual techniques to be evaluated at various stages, as well as their impact in combination with other sub-systems. All the approaches considered in this paper are evaluated on an Arabic large vocabulary speech recognition task which includes both Broadcast News and Broadcast Conversation test data.
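
A hedged sketch of linear input network (LIN) adaptation: a trainable linear layer, initialised near identity, is placed in front of a frozen speaker-independent MLP and estimated on the adaptation data alone. Layer sizes and the toy training loop are assumptions for illustration.

```python
import torch
import torch.nn as nn

feat_dim, n_states = 39, 120
si_mlp = nn.Sequential(nn.Linear(feat_dim, 500), nn.Sigmoid(),
                       nn.Linear(500, n_states))         # speaker-independent MLP
for p in si_mlp.parameters():
    p.requires_grad = False                              # frozen during adaptation

lin = nn.Linear(feat_dim, feat_dim)                      # the LIN
with torch.no_grad():
    lin.weight.copy_(torch.eye(feat_dim))                # start near identity
    lin.bias.zero_()

opt = torch.optim.SGD(lin.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
x = torch.randn(256, feat_dim)                           # adaptation frames
t = torch.randint(0, n_states, (256,))                   # state targets

for _ in range(10):                                      # a few adaptation epochs
    opt.zero_grad()
    loss = loss_fn(si_mlp(lin(x)), t)                    # only the LIN is updated
    loss.backward()
    opt.step()
```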

Acoustic Feature Combination for Robust Speech Recognition

Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), 2005

In this paper, we consider the use of multiple acoustic features of the speech signal for robust speech recognition. We investigate the combination of various auditory-based features (Mel Frequency Cepstrum Coefficients, Perceptual Linear Prediction, etc.) and an articulatory-based feature (voicedness). Features are combined by a Linear Discriminant Analysis (LDA) based technique and by a log-linear model combination based technique. We describe the two feature combination techniques and compare the experimental results. Experiments performed on the large-vocabulary task VerbMobil II (German conversational speech) show that the accuracy of automatic speech recognition systems can be improved by the combination of different acoustic features.
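
A sketch of the LDA-based combination under common assumptions: the auditory and articulatory feature streams are concatenated frame by frame and an LDA transform projects the joint vector to a decorrelated, lower-dimensional space. Stream dimensions, class count, and data are illustrative.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
mfcc = rng.normal(size=(5000, 16))         # auditory stream 1
plp = rng.normal(size=(5000, 12))          # auditory stream 2
voicedness = rng.normal(size=(5000, 1))    # articulatory stream
y = rng.integers(0, 40, size=5000)         # state/phoneme labels

X = np.hstack([mfcc, plp, voicedness])     # frame-wise concatenation
lda = LinearDiscriminantAnalysis(n_components=24).fit(X, y)
combined = lda.transform(X)                # (5000, 24) combined features
```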

Study of algorithms to combine multiple automatic speech recognition (ASR) system outputs

Automatic Speech Recognition (ASR) systems recognize word sequences by employing algorithms such as Hidden Markov Models. Given the same speech to recognize, different ASRs may output very similar results but with errors such as insertions, substitutions or deletions of words. Since different ASRs may be based on different algorithms, it is likely that error segments across ASRs are uncorrelated. It may therefore be possible to improve recognition accuracy by combining the hypotheses of multiple ASRs. System combination is a technique that combines the outputs of two or more ASRs to estimate the most likely hypothesis among conflicting word pairs or differing hypotheses for the same part of an utterance. In this thesis, a conventional voting scheme called Recognizer Output Voting Error Reduction (ROVER) is studied. A weighted voting scheme based on Bayesian theory known as Bayesian Combination (BAYCOM) is implemented. BAYCOM is derived fr...
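
A toy sketch of ROVER-style voting, assuming the hypotheses are already aligned word by word (the real ROVER builds a word transition network via dynamic-programming alignment); the confidence-free majority vote below is the simplest scoring variant.

```python
from collections import Counter

def rover_vote(aligned_hyps):
    """aligned_hyps: equal-length word lists; '' marks a deletion slot."""
    combined = []
    for slot in zip(*aligned_hyps):
        word, _ = Counter(slot).most_common(1)[0]   # majority vote per slot
        if word:                                    # skip slots where deletion wins
            combined.append(word)
    return combined

hyps = [["the", "cat", "sat", ""],
        ["the", "bat", "sat", "down"],
        ["the", "cat", "", "down"]]
print(rover_vote(hyps))                             # ['the', 'cat', 'sat', 'down']
```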

Context-dependent connectionist probability estimation in a hybrid hidden Markov model-neural net speech recognition system

Computer Speech & Language, 1994

In this paper we present a training method and a network architecture for estimating context-dependent observation probabilities in the framework of a hybrid hidden Markov model (HMM) / multilayer perceptron (MLP) speaker-independent continuous speech recognition system. The context-dependent modeling approach we present here computes the HMM context-dependent observation probabilities using a Bayesian factorization in terms of context-conditioned posterior phone probabilities, which are computed with a set of MLPs, one for every relevant context. The proposed network architecture shares the input-to-hidden layer among the set of context-dependent MLPs in order to reduce the number of independent parameters. Multiple states for phone models, with different context dependence for each state, are used to model the different context effects at the beginning and end of phonetic segments. A new training procedure that "smooths" networks with different degrees of context dependence is proposed to obtain a robust estimate of the context-dependent probabilities. We have used this new architecture to model generalized biphone phonetic contexts. Tests with the speaker-independent DARPA Resource Management database have shown average reductions in word error rates of 28% using a word-pair grammar, compared to our earlier context-independent HMM/MLP hybrid.
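
A minimal sketch of the Bayesian factorization the abstract describes, P(q, c | x) = P(c | q, x) * P(q | x), with one context-conditioned distribution per phone; stand-in probability tables replace the trained MLPs, and all numbers are illustrative.

```python
import numpy as np

n_phones, n_contexts = 4, 3
p_q_given_x = np.array([0.7, 0.1, 0.1, 0.1])       # context-independent MLP output P(q|x)

# one context-conditioned distribution P(c | q, x) per phone
p_c_given_qx = np.full((n_phones, n_contexts), 1.0 / n_contexts)
p_c_given_qx[0] = [0.6, 0.3, 0.1]                  # e.g. phone 0's context MLP output

# joint context-dependent posterior via the factorization
p_qc_given_x = p_q_given_x[:, None] * p_c_given_qx  # (n_phones, n_contexts)
assert np.isclose(p_qc_given_x.sum(), 1.0)          # still a valid distribution
```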