Tagging L2 Writing: Learner Errors and the Performance of an Automated Part-of-Speech Tagger

GEMA Online® Journal of Language Studies

This paper is concerned with the application of technologies developed in other disciplines, in particular the use of text processing techniques to investigate the problems of second language learner writing in English. The question addressed is whether texts produced by L1-Malay learners at the University of Malaya can usefully be processed using the Constituent Likelihood Automatic Word-tagging System (CLAWS), a part-of-speech (POS) tagger developed for and trained on texts written by native speakers of the language. The study adopts the procedure employed by van Rooy and Schäfer (2002). CLAWS was used to automatically POS tag a subset of the Malaysian Corpus of Learner English (MACLE), and the tagged texts were then analyzed for tagging accuracy. CLAWS was found to perform less well on learner texts than on native speaker texts, but still with an accuracy rate of over 90%. The sources of tagging error are traced, and spelling errors are found to be the most common source. Closer inspection indicates that successful tagging is likely to lead to problems downstream in later processing, which suggests that, to optimize performance, some modifications will be required in tagger design.
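The evaluation described in the abstract, comparing automatically assigned tags against a manually checked reference and tallying the causes of disagreement, can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure or tooling used in the study: the function names, the cause labels, and the five-token example are all hypothetical, and the tags are merely written in the style of a CLAWS tagset.

```python
from collections import Counter


def tagging_accuracy(gold, predicted):
    """Return the proportion of tokens whose predicted tag matches the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def error_sources(gold, predicted, causes):
    """Count mistagged tokens by manually assigned cause.

    `causes` maps a token index to a label chosen by the analyst
    (hypothetical labels such as "spelling"); mistagged tokens
    without a label are counted as "unclassified".
    """
    counts = Counter()
    for i, (g, p) in enumerate(zip(gold, predicted)):
        if g != p:
            counts[causes.get(i, "unclassified")] += 1
    return counts


# Invented toy data: one of five tokens is mistagged.
gold = ["AT", "NN1", "VVD", "JJ", "NN1"]
predicted = ["AT", "NN1", "VVD", "NN1", "NN1"]
causes = {3: "spelling"}  # analyst's judgement for token 3

accuracy = tagging_accuracy(gold, predicted)
sources = error_sources(gold, predicted, causes)
print(f"accuracy = {accuracy:.2f}")
print(sources.most_common())
```

Keeping the cause labels separate from the tag comparison reflects the fact that accuracy can be computed mechanically, whereas attributing an error to a source such as spelling requires human inspection.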
