Incorporating an Error Corpus into a Spellchecker for Maltese

A Spell Checker for a World Language: The New Microsoft's Spanish Spell Checker

lrec-conf.org

This paper reports on work carried out at Microsoft Corporation to develop a speller for Spanish. It discusses the technique for isolated-word error correction used by the speller, gives general descriptions of the error data collection and error typology, and surveys a variety of linguistic considerations that are relevant when dealing with a world language spread over several countries and exposed to different language influences. It has been claimed that the state of the art for practical applications based on isolated-word error correction does not always offer a sensible set of ranked candidates for a misspelling. We show that, even so, introducing a finer-grained categorization of errors and using their relative frequency has had a positive impact on the speller application developed for Spanish (the corresponding evaluation data is presented).
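The idea of ranking candidates by the relative frequency of their error category can be illustrated with a minimal Python sketch. It is not the method of the Microsoft speller: the category names, the frequency values in `ERROR_TYPE_FREQ`, and the helper functions `classify_single_edit` and `rank_candidates` are all hypothetical, and only single-edit errors are handled.

```python
import unicodedata
from typing import Dict, List, Optional, Tuple

# Hypothetical relative frequencies of error categories (illustrative values,
# not taken from the paper).
ERROR_TYPE_FREQ: Dict[str, float] = {
    "missing_accent": 0.40,
    "substitution": 0.25,
    "deletion": 0.15,
    "insertion": 0.12,
    "transposition": 0.08,
}

# Accented vowels and the plain letter typed when the accent is forgotten.
ACCENTED_TO_PLAIN = {"á": "a", "é": "e", "í": "i", "ó": "o", "ú": "u"}


def classify_single_edit(typo: str, intended: str) -> Optional[str]:
    """Name the single-edit error that turns `intended` into `typo`, if any."""
    typo = unicodedata.normalize("NFC", typo)
    intended = unicodedata.normalize("NFC", intended)
    if typo == intended:
        return None
    if len(typo) == len(intended):
        diffs = [i for i in range(len(typo)) if typo[i] != intended[i]]
        if len(diffs) == 1:
            i = diffs[0]
            if ACCENTED_TO_PLAIN.get(intended[i]) == typo[i]:
                return "missing_accent"
            return "substitution"
        if (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and typo[diffs[0]] == intended[diffs[1]]
                and typo[diffs[1]] == intended[diffs[0]]):
            return "transposition"
        return None
    if len(typo) + 1 == len(intended):
        # The writer dropped one character of the intended word.
        if any(intended[:i] + intended[i + 1:] == typo
               for i in range(len(intended))):
            return "deletion"
        return None
    if len(typo) == len(intended) + 1:
        # The writer typed one extra character.
        if any(typo[:i] + typo[i + 1:] == intended for i in range(len(typo))):
            return "insertion"
    return None


def rank_candidates(typo: str, lexicon: List[str]) -> List[Tuple[str, float]]:
    """Keep lexicon words one edit away and order them by category frequency."""
    scored = []
    for word in lexicon:
        category = classify_single_edit(typo, word)
        if category is not None:
            scored.append((word, ERROR_TYPE_FREQ[category]))
    # Highest frequency first; ties broken alphabetically.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

With these toy values, `rank_candidates("camion", ["camino", "cambio", "camión"])` places "camión" (a missing accent) ahead of "camino" (a transposition) and discards "cambio", which is more than one edit away.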

XUXEN: A Spelling Checker/Corrector for

This paper describes the application of two-level morphology to Basque, along with its use in the elaboration of the XUXEN spelling checker/corrector. The morphological analyzer included in XUXEN has been designed with the aim of laying the foundations for further development of automatic processing of Basque. The fact that Basque is a highly inflected language makes the correction of spelling errors extremely difficult because collecting all the possible word-forms in a lexicon is an endless task.

Creation and Evaluation of Extensible Language Resources for Maltese

2002

The creation of Language Resources is a labour-intensive process whose difficulty is further compounded when minority languages are concerned (Cunningham, 1999). This paper discusses the creation of an extensible set of Language Resources for Maltese developed by the Maltilex Project at the University of Malta (Rosner et al., 1999), together with quality evaluation mechanisms for minority

Determiner errors in Basque: Analysis and automatic detection

2009

Abstract: In this article we present a study carried out to analyse the incorrect use of determiners in texts written in Basque. The exhaustive analysis of this type of error (through the collected examples) has been the basis for detecting such errors automatically. The collection and analysis of errors are essential for developing a grammar checker for Basque and for creating intelligent computer-assisted language learning (ICALL) systems. Keywords: error analysis, determiner errors, automatic detection.

Korektor – A System for Contextual Spell-Checking and Diacritics Completion

2012

We present Korektor, a flexible and powerful purely statistical text correction tool for Czech that goes beyond a traditional spell checker. We use a combination of several language models and an error model to offer the best ordering of correction proposals and also to find errors that cannot be detected by simple spell checkers, namely spelling errors that happen to be homographs of existing word forms. Without any adaptation, our system also works as a diacritics generator, with the best reported results for Czech text. The design of Korektor contains no language-specific parts other than trained statistical models, which makes it highly suitable to be trained for other languages with available resources. The evaluation demonstrates that the system is a state-of-the-art tool for Czech, both as a spell checker and as a diacritics generator. We also show that these functions combine into a potential aid in the error annotation of a learner corpus of Czech.
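How a language model and an error model can combine to rank proposals, and to flag a misspelling that is itself a valid word, can be sketched as a toy noisy-channel scorer in Python. This is not Korektor's implementation: the bigram table `BIGRAM`, the table `ERROR_MODEL`, the `FLOOR` value and the function `best_correction` are hypothetical, and the probabilities are invented English examples rather than trained Czech models.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical bigram language model: P(word | previous word). Toy values.
BIGRAM: Dict[Tuple[str, str], float] = {
    ("a", "piece"): 0.02,
    ("a", "peace"): 0.0001,
    ("world", "peace"): 0.03,
    ("world", "piece"): 0.0001,
}

# Hypothetical error model: P(typed | intended), keyed as (typed, intended).
ERROR_MODEL: Dict[Tuple[str, str], float] = {
    ("peace", "peace"): 0.95,
    ("peace", "piece"): 0.05,
    ("piece", "piece"): 0.95,
    ("piece", "peace"): 0.05,
}

FLOOR = 1e-9  # assumed probability for pairs missing from either table


def score(prev: str, typed: str, candidate: str) -> float:
    """Log-probability of `candidate` given the context and the typed word."""
    lm = BIGRAM.get((prev, candidate), FLOOR)
    err = ERROR_MODEL.get((typed, candidate), FLOOR)
    return math.log(lm) + math.log(err)


def best_correction(prev: str, typed: str, candidates: List[str]) -> str:
    """Return the candidate that maximises language model times error model."""
    return max(candidates, key=lambda cand: score(prev, typed, cand))
```

In this sketch, `best_correction("a", "peace", ["peace", "piece"])` prefers "piece" even though "peace" is a valid word, because the context outweighs the error model's preference for the typed form, while the same typed word after "world" is left unchanged.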

XUXEN: A spelling checker/corrector for Basque based on Two-Level Morphology

1992

Abstract: The application of the formalism of two-level morphology to Basque and its use in the elaboration of the XUXEN spelling checker/corrector are described. This application is intended to cover a large part of the language. Because Basque is a highly inflected language, the approach to spelling checking and correction has been conceived as a by-product of a general-purpose morphological analyzer/generator. This analyzer is taken as a basic tool for current and future work on automatic processing of Basque.