Yuqing Guo | Dublin City University
Papers by Yuqing Guo
An important element in question answering systems is the analysis and interpretation of questions. Using the NTCIR 5 Cross-Language Question Answering (CLQA) question test set we demonstrate that the accuracy of deep question analysis is dependent on the quantity and suitability of the available linguistic resources. We further demonstrate that applying question analysis tools developed on monolingual training materials to questions translated Chinese-English and English-Chinese using machine translation produces much reduced effectiveness in interpretation of the question. This latter result indicates that question analysis for CLQA should primarily be conducted in the question language prior to translation.
To date, work on Non-Local Dependencies (NLDs) has focused almost exclusively on English and it is an open research question how well these approaches migrate to other languages. This paper surveys non-local dependency constructions in Chinese as represented in the Penn Chinese Treebank (CTB) and provides an approach for generating proper predicate-argument-modifier structures including NLDs from surface context-free phrase structure trees. Our approach recovers non-local dependencies at the level of Lexical-Functional Grammar f-structures, using automatically acquired subcategorisation frames and f-structure paths linking antecedents and traces in NLDs. Currently our algorithm achieves 92.2% f-score for trace insertion and 84.3% for antecedent recovery evaluating on gold-standard CTB trees, and 64.7% and 54.7%, respectively, on CTB-trained state-of-the-art parser output trees.
Guo, Yuqing. Treebank-Based Acquisition of Chinese LFG Resources for Parsing and Generation. PhD thesis, Dublin City University, Nov 1, 2009
Proceedings of the 13th European Workshop on Natural Language Generation, Sep 28, 2011
China-Ireland International Conference on Information and Communications Technologies (CIICT 2007), 2007
Proceedings of the Fifth International Natural Language Generation Conference (INLG '08), 2008
2001 IEEE International Conference on Systems, Man and Cybernetics. e-Systems and e-Man for Cybernetics in Cyberspace (Cat.No.01CH37236), 2001
Semantic paragraph partition is an important problem in text structure analysis in an automatic abstracting system. For an article containing distinct headings, the paper presents heading models in Chinese text to divide an article into semantic paragraphs based on the recognition of headings. For an article not containing headings, the paper establishes a vector space model for the whole article
This paper describes log-linear models for a general-purpose sentence realizer based on dependency structures. Unlike traditional realizers using grammar rules, our method realizes sentences by linearizing dependency relations directly in two steps. First, the relative order between head and each dependent is determined by their dependency relation. Then the best linearizations compatible with the relative order are selected by log-linear models. The log-linear models incorporate three types of feature functions, including dependency relations, surface words and headwords. Our approach to sentence realization provides simplicity, efficiency and competitive accuracy. Trained on 8,975 dependency structures of a Chinese Dependency Treebank, the realizer achieves a BLEU score of 0.8874.
ACM Transactions on Asian Language Information Processing, 2010
This article investigates a relatively underdeveloped subject in natural language processing: the generation of punctuation marks. From a theoretical perspective, we study 16 Chinese punctuation marks as defined in the Chinese national standard of punctuation usage, and categorize these punctuation marks into three different types according to their syntactic properties. We implement a three-tier maximum entropy model incorporating linguistically-motivated features for generating the commonly used Chinese punctuation marks in unpunctuated sentences output by a surface realizer. Furthermore, we present a method to automatically extract cue words indicating sentence-final punctuation marks as a specialized feature to construct a more precise model. Evaluating on the Penn Chinese Treebank data, the MaxEnt model achieves an f-score of 79.83% for punctuation insertion and 74.61% for punctuation restoration using gold data input, 79.50% for insertion and 73.32% for restoration using parse...