Ye-Yi Wang - Academia.edu
Papers by Ye-Yi Wang
Proceedings of the ACM Web Conference 2022
Voice search is the technology underlying many spoken dialog applications that enable users to access information using spoken queries. This paper reviews voice search technology, and proposes a new and effective method for computing semantic confidence measures. It explores the use of maximum entropy classifiers as confidence models, and investigates a feature selection algorithm that leads to an effective subset of prominent features for the classifier. The experimental results on a directory assistance application show that the reduced feature set not only makes the model more effective in handling different recognition and search engine combinations, but also results in a very informative confidence measure that is closely correlated with the actual voice search accuracy. Index Terms: voice search, directory assistance, confidence measure, Tf-Idf vector space model, maximum entropy model.
This paper reports our recent development of a highly reliable call analysis technique that makes novel use of automatic speech recognition (ASR), speech utterance classification, and non-speech features. The main ideas include the use of the N-gram filler model to improve the ASR accuracy on important words in a message, the integration of the recognized utterance with non-speech features such as utterance length, and the use of an utterance classification technique to interpret the message and extract additional information. Experimental evaluation shows that the use of the utterance length, recognized text, and the classifier’s confidence measure reduces the classification error rate to 2.5% on the test sets.
Representation learning has transformed the field of machine learning. Advances like ImageNet, word2vec, and BERT demonstrate the power of pre-trained representations to accelerate model training. The effectiveness of these techniques derives from their ability to represent words, sentences, and images in context. Other entity types, such as people and topics, are crucial sources of context in enterprise use-cases, including organization, recommendation, and discovery of vast streams of information. But learning representations for these entities from private data aggregated across user shards carries the risk of privacy breaches. Personalizing representations by conditioning them on a single user’s content eliminates privacy risks while providing a rich source of context that can change the interpretation of words, people, documents, groups, and other entities commonly encountered in workplace data. In this paper, we explore methods that embed user-conditioned representations of pe...
Interspeech 2005, 2005
Interspeech 2010, 2010
Interspeech 2016, 2016
2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings
The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology, 2004
IEEE Signal Processing Magazine, 2013
IEEE Signal Processing Magazine, 2011
IEEE Signal Processing Magazine, 2008
IEEE Transactions on Audio, Speech, and Language Processing, 2008
International Conference on Acoustics, Speech, and Signal Processing, 1991
A novel connectionist system for dialog processing is described. Based on a script-like formalism, the system consists of several modular neural networks which can track the semantic flow of a dialog. The system can be extended to understand and translate dialogs in a certain domain.