Huy Nguyen | University of Kent


Papers by Huy Nguyen

Exploiting Context for Biomedical Entity Recognition: From Syntax to the Web

We describe a machine learning system for the recognition of names in biomedical texts. The system makes extensive use of local and syntactic features within the text, as well as external resources including the web and gazetteers. It achieves an F-score of 70% on the COLING 2004 NLPBA/BioNLP shared task of identifying five biomedical named entities in the GENIA corpus.

Named Entity Recognition with Character-Level Models

We discuss two named-entity recognition models which use characters and character n-grams either exclusively or as an important part of their data representation. The first model is a character-level HMM with minimal context information, and the second model is a maximum-entropy conditional Markov model with substantially richer context features. Our best model achieves an overall F1 of 86.07% on the English test data (92.31% on the development data). This number represents a 25% error reduction over the same model without word-internal (substring) features.
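
To illustrate what word-internal (substring) features might look like, here is a minimal sketch of character n-gram extraction. It is not taken from the paper: the function name `char_ngrams`, the `#` boundary marker, and the default n-gram range are all hypothetical choices made for this example.

```python
def char_ngrams(word, n_min=2, n_max=4, boundary="#"):
    """Return the character n-grams of `word` for n in [n_min, n_max].

    The word is padded with a boundary marker on each side so that
    prefixes and suffixes yield distinct n-grams. The marker and the
    default range are illustrative assumptions, not the paper's settings.
    """
    padded = f"{boundary}{word}{boundary}"
    grams = []
    for n in range(n_min, n_max + 1):
        # Slide a window of width n across the padded word.
        for i in range(len(padded) - n + 1):
            grams.append(padded[i:i + n])
    return grams


if __name__ == "__main__":
    # Substring features for a capitalised token.
    print(char_ngrams("Kent", n_min=2, n_max=3))
```

Features of this kind let a classifier pick up on cues such as capitalisation or typical name suffixes even when the whole word was never seen in training.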
