Lexical Disambiguation in LTAG Using Left Context
Related papers
A l'interface de la phonologie et de la lexicologie
HAL (Le Centre pour la Communication Scientifique Directe), 2022
HAL is a multidisciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L'archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d'enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
Modelling usage information in a legacy dictionary: from TEI Lex-0 to Ontolex-Lemon
2022
Scholarpedia, 2008
He is co-editor of the CLEF (Cross-Language Evaluation Forum) proceedings on multilingual information access published by Springer. In Country, where he is a member of the IXA NLP group. Paul Buitelaar is a Senior Researcher in the Language Technology Lab. He organized several international workshops and has been an invited speaker at panels and workshops on topics in semantic annotation and ontology development.
2018
One of the central problems in the semantics of derived words is polysemy (see, for example, the recent contributions by Lieber 2016 and Plag et al. 2018). In this paper, we tackle the problem of disambiguating newly derived words in context by applying Distributional Semantics (Firth 1957) to deverbal -ment nominalizations (e.g. bedragglement, emplacement). We collected a dataset containing contexts of low-frequency deverbal -ment nominalizations (55 types, 406 tokens, see Appendix B) extracted from large corpora such as the Corpus of Contemporary American English. We chose low-frequency derivatives because high-frequency formations are often lexicalized and thus tend not to exhibit the kind of polysemous readings we are interested in. Furthermore, disambiguating low-frequency words is an especially difficult task because there is little to no prior knowledge about these words from which their semantic properties can be extrapolated. The data was manually annotated as eventive or non-eventive, with an additional ambiguous label for cases where the context did not disambiguate. Our question was to what extent, and under which conditions, context-derived representations such as those of Distributional Semantics can be successfully employed in the disambiguation of low-frequency derivatives. Our results show that, first, our models are able to distinguish between eventive and non-eventive readings with some success. Second, very small context windows are sufficient to find the intended interpretation in the majority of cases. Third, ambiguous instances tend to be classified as events. Fourth, the performance of the classifier differed across subcategories of nouns, with non-eventive derivatives being harder to classify correctly. We present indirect evidence that this is due to the semantic similarity of abstract non-eventive nouns to eventive nouns.
Overall, this paper demonstrates that distributional semantic models can be fruitfully employed for the disambiguation of low-frequency words in spite of the scarcity of available contextual information.
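The idea of classifying a derivative from a small context window can be illustrated with a minimal sketch. It is not the model used in the paper: it assumes plain bag-of-words count vectors, one summed "centroid" per label, and cosine similarity, and every function name and example sentence in it is hypothetical and invented for illustration.

```python
from collections import Counter
from math import sqrt


def context_window(tokens, index, size):
    """Return up to `size` tokens on each side of position `index`, excluding the target."""
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    return left + right


def cosine(a, b):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def train_centroids(examples, size):
    """Sum the context-word counts of all training examples, separately per label."""
    centroids = {}
    for tokens, index, label in examples:
        window = context_window(tokens, index, size)
        centroids.setdefault(label, Counter()).update(window)
    return centroids


def classify(tokens, index, centroids, size):
    """Pick the label whose centroid is most similar to the target's context vector."""
    vector = Counter(context_window(tokens, index, size))
    # Sorting the labels makes tie-breaking deterministic.
    return max(sorted(centroids), key=lambda label: cosine(vector, centroids[label]))


# Invented toy data: (tokens, position of the target word, label).
TRAIN = [
    ("the emplacement of the guns took two days".split(), 1, "eventive"),
    ("during the emplacement of the sensors".split(), 2, "eventive"),
    ("soldiers hid behind the concrete emplacement near the road".split(), 5, "non-eventive"),
    ("a concrete emplacement stood near the hill".split(), 2, "non-eventive"),
]

WINDOW = 2  # a deliberately small context window
CENTROIDS = train_centroids(TRAIN, WINDOW)

print(classify("they hid behind a concrete emplacement".split(), 5, CENTROIDS, WINDOW))
print(classify("the emplacement of the battery".split(), 1, CENTROIDS, WINDOW))
```

With only two words of context on each side, the sketch relies entirely on overlap with words seen in training, which is a far cruder signal than the corpus-derived distributional vectors the abstract refers to.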
Corpus oraux, guide des bonnes pratiques 2006
2006
HAL (Le Centre pour la Communication Scientifique Directe), 2002
Word sense disambiguation in multilingual contexts
2017
Word sense disambiguation is defined as the process of identifying the sense that a polysemous word, that is, a word with several possible meanings, takes on in the specific context of a sentence. Because the meaning of every word in a text must be defined unambiguously before an automatic system can understand and work with it, semantic disambiguation is a crucial aspect that cuts across every task in Natural Language Processing. The research carried out in this doctoral thesis focuses on semantic disambiguation in scenarios where texts written in several languages can be used. Within these scenarios, we divide the thesis into two broad areas, according to the specific disambiguation tasks we face: bilingual word sense disambiguation, and multilingual disambiguation in the biomedical domain. In the first ...
Complex structures and semantics in free word association
HAL (Le Centre pour la Communication Scientifique Directe), 2012