Context clustering for word sense disambiguation based on modeling pairwise context similarities

Word Sense Disambiguation based on Term to Term Similarity in a Context Space

2004

This paper describes the exemplar-based approach presented by UNED at Senseval-3. Instead of representing contexts as bags of terms and defining a similarity measure between contexts, we propose to represent terms as bags of contexts and to define a similarity measure between terms. Thus, words, lemmas and senses are represented in the same space (the context space), and similarity measures can be defined between them. New contexts are transformed into this representation in order to calculate their similarity to the candidate senses. We show how standard similarity measures obtain better results in this framework. A new similarity measure in the context space is proposed for selecting senses and performing disambiguation. Results of this approach at Senseval-3 are reported here.
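The core idea lends itself to a compact sketch. The following is a minimal, hypothetical rendering, assuming a simple additive rule for projecting a new context into the context space and plain cosine similarity; the paper's own projection and its proposed similarity measure may differ, and all function names are mine:

```python
from collections import defaultdict
import math

def build_context_space(training_examples):
    """training_examples: iterable of (context_id, terms, sense).
    Terms and senses are both represented as bags of the contexts
    they occur in, so they live in the same space."""
    term_vec = defaultdict(lambda: defaultdict(float))
    sense_vec = defaultdict(lambda: defaultdict(float))
    for ctx_id, terms, sense in training_examples:
        for t in terms:
            term_vec[t][ctx_id] += 1.0
        sense_vec[sense][ctx_id] += 1.0
    return term_vec, sense_vec

def cosine(u, v):
    num = sum(u[k] * v[k] for k in u.keys() & v.keys())
    den = math.sqrt(sum(x * x for x in u.values())) \
        * math.sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0

def disambiguate(new_terms, term_vec, sense_vec):
    """Map a new context into the context space by summing the
    vectors of the terms it contains, then pick the nearest sense."""
    proj = defaultdict(float)
    for t in new_terms:
        for ctx_id, w in term_vec.get(t, {}).items():
            proj[ctx_id] += w
    return max(sense_vec, key=lambda s: cosine(proj, sense_vec[s]))
```

Because senses and terms are built from the same context counts, one similarity function applies uniformly to both.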

Word relatives in context for word sense disambiguation

Proceedings of the 2006 Australasian Language Technology Workshop, 2006

Progress in Word Sense Disambiguation (WSD) is currently hampered by the lack of training data. In this paper we present a novel disambiguation algorithm that improves on previous systems based on acquisition of examples by incorporating local context information. With a basic configuration, our method obtains state-of-the-art performance. We complement this work by evaluating other well-known methods on the same dataset and analysing the comparative results per word. We observed that each ...

Combining Contextual Features for Word Sense Disambiguation

2002

In this paper we present a maximum entropy Word Sense Disambiguation system which performs competitively on SENSEVAL-2 test data for English verbs. We demonstrate that using richer linguistic contextual features significantly improves tagging accuracy, and we compare the system's performance with human annotator performance in light of both the fine-grained and the coarse-grained sense distinctions made by the sense inventory.
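As an illustration of a maximum-entropy WSD classifier, here is a hedged sketch using scikit-learn's multinomial logistic regression (mathematically a maximum entropy model); the feature templates and the toy data are placeholders, not the paper's feature set:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def extract_features(tokens, i):
    """Simple contextual features around the target token at index i:
    neighbouring words and a collocation (a stand-in for the richer
    linguistic features described above)."""
    feats = {"w0": tokens[i]}
    if i > 0:
        feats["w-1"] = tokens[i - 1]
    if i + 1 < len(tokens):
        feats["w+1"] = tokens[i + 1]
        feats["bigram"] = tokens[i] + "_" + tokens[i + 1]
    return feats

# Toy training data: (tokenized sentence, target index, sense label).
train = [
    (["deposit", "money", "in", "the", "bank"], 4, "bank/finance"),
    (["sat", "on", "the", "river", "bank"], 4, "bank/river"),
]
X = [extract_features(toks, i) for toks, i, _ in train]
y = [sense for _, _, sense in train]

vec = DictVectorizer()
clf = LogisticRegression(max_iter=1000)  # multinomial logit == MaxEnt
clf.fit(vec.fit_transform(X), y)

test = extract_features(["fish", "near", "the", "river", "bank"], 4)
print(clf.predict(vec.transform([test])))  # likely: bank/river
```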

Combining Local and Related Context for Word Sense Disambiguation on Specific Domains

Proceedings of the International Conference on Data Technologies and Applications, 2012

In this paper an approach for word sense disambiguation (WSD) in documents is presented. The local and related context of an ambiguous word is extracted, and this context is used to retrieve second-order vectors from WordNet. Two graphs are built in parallel and evaluated individually; finally, both results are combined to automatically assign the correct sense to the ambiguous word. The proposed approach was tested on task #17 of the SemEval 2010 international competition, producing promising results compared to other approaches.
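A sketch of the second-order vectors mentioned above, with the paper's two-graph combination reduced to a simple interpolation of two cosine scores; the weighting and all helper names are assumptions:

```python
from collections import Counter
import math

def cosine(u, v):
    num = sum(u[k] * v[k] for k in u.keys() & v.keys())
    den = math.sqrt(sum(x * x for x in u.values())) \
        * math.sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0

def first_order(corpus):
    """First-order co-occurrence vectors from a (toy) corpus:
    each word maps to counts of the words it co-occurs with."""
    vecs = {}
    for sent in corpus:
        for w in sent:
            vecs.setdefault(w, Counter()).update(t for t in sent if t != w)
    return vecs

def second_order(words, fo):
    """Second-order vector: sum of the first-order vectors of `words`
    (e.g. the content words of a WordNet gloss, or of a context)."""
    v = Counter()
    for w in words:
        v.update(fo.get(w, Counter()))
    return v

def score_sense(local_ctx, related_ctx, gloss_words, fo, alpha=0.5):
    """Interpolate the evidence from the local and the related context
    (a stand-in for the paper's combination of the two graphs)."""
    g = second_order(gloss_words, fo)
    return alpha * cosine(second_order(local_ctx, fo), g) \
        + (1 - alpha) * cosine(second_order(related_ctx, fo), g)
```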

Similarity-based Word Sense Disambiguation

We describe a method for automatic word sense disambiguation using a text corpus and a machine-readable dictionary (MRD). The method is based on word similarity and context similarity measures. Words are considered similar if they appear in similar contexts; contexts are similar if they contain similar words. The circularity of this definition is resolved by an iterative, converging process, in which the system learns from the corpus a set of typical usages for each of the senses of the polysemous word listed in the MRD. A new instance of a polysemous word is assigned the sense associated with the typical usage most similar to its context. Experiments show that this method performs well, and can learn even from very sparse training data.
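The converging process can be sketched as a fixed-point iteration over two similarity tables. The update rule below (average best-match similarity) is a simplification chosen for brevity, not the paper's exact formulation:

```python
def avg_best(xs, ys, sim):
    """Average, over the items of xs, of their best match in ys."""
    if not xs or not ys:
        return 0.0
    return sum(max(sim[(x, y)] for y in ys) for x in xs) / len(xs)

def iterate_similarities(word_contexts, context_words, n_iters=5):
    """word_contexts: word -> set of context ids; context_words:
    context id -> set of words (the two maps must be mutually
    consistent). Words are similar if they appear in similar
    contexts; contexts are similar if they contain similar words.
    Start from identity similarities and iterate."""
    words, ctxs = list(word_contexts), list(context_words)
    sim_w = {(a, b): 1.0 if a == b else 0.0 for a in words for b in words}
    sim_c = {(c, d): 1.0 if c == d else 0.0 for c in ctxs for d in ctxs}
    for _ in range(n_iters):
        sim_c = {(c, d): avg_best(context_words[c], context_words[d], sim_w)
                 for c in ctxs for d in ctxs}
        sim_w = {(u, v): avg_best(word_contexts[u], word_contexts[v], sim_c)
                 for u in words for v in words}
    return sim_w, sim_c
```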

Enhancing Modern Supervised Word Sense Disambiguation Models by Semantic Lexical Resources

2018

Supervised models for Word Sense Disambiguation (WSD) currently yield state-of-the-art results on the most popular benchmarks. Despite the recent introduction of Word Embeddings and Recurrent Neural Networks to design powerful context-related features, interest in improving WSD models using Semantic Lexical Resources (SLRs) is mostly restricted to knowledge-based approaches. In this paper, we enhance “modern” supervised WSD models by exploiting two popular SLRs: WordNet and WordNet Domains. We propose an effective way to introduce semantic features into the classifiers, and we consider using the SLR structure to augment the training data. We study the effect of different types of semantic features, investigating their interaction with local contexts encoded by means of mixtures of Word Embeddings or Recurrent Neural Networks, and we extend the proposed model into a novel multi-layer architecture for WSD. A detailed experimental comparison in the recent Unified Evaluation Framewo...
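One simple way to combine embedding-based local context with lexical-resource features, sketched under loose assumptions: WordNet supersenses stand in here for the paper's WordNet/WordNet Domains features, and `emb` is a hypothetical word-to-vector lookup:

```python
import numpy as np
from nltk.corpus import wordnet as wn

# Fixed inventory of WordNet supersenses (lexnames).
SUPERSENSES = sorted({s.lexname() for s in wn.all_synsets()})

def semantic_features(context_words):
    """Binary indicators of the supersenses evoked by the context
    (a stand-in for the paper's semantic features)."""
    sem = np.zeros(len(SUPERSENSES))
    for w in context_words:
        for s in wn.synsets(w):
            sem[SUPERSENSES.index(s.lexname())] = 1.0
    return sem

def features(context_words, emb, dim):
    """Concatenate the mean word embedding of the context with the
    semantic indicators; the result feeds any downstream classifier."""
    vecs = [emb[w] for w in context_words if w in emb]
    ctx = np.mean(vecs, axis=0) if vecs else np.zeros(dim)
    return np.concatenate([ctx, semantic_features(context_words)])
```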

Automatic Word Sense Disambiguation Using Cooccurrence and Hierarchical Information

Lecture Notes in Computer Science, 2010

We review in detail here a polished version of the systems with which we participated in the Senseval-2 competition English tasks (all words and lexical sample). It is based on a combination of selectional preference measured over a large corpus and hierarchical information taken from WordNet, as well as some additional heuristics. We use that information to expand the glosses of the senses in WordNet and compare the similarity between the context vectors and the word sense vectors in a way similar to that used by Yarowsky and Schuetze. A supervised extension of the system is also discussed. We provide a new and previously unpublished evaluation over the SemCor collection, which is two orders of magnitude larger than the SENSEVAL-2 collections, as well as a comparison with baselines. Our systems scored first among unsupervised systems in both tasks. We note that the method is very sensitive to the quality of the characterizations of word senses; glosses are much better than training examples.
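The gloss-expansion step can be sketched with NLTK's WordNet interface; the expansion depth, the absence of weighting, and the bag-of-words treatment of glosses are my assumptions, not the paper's exact recipe:

```python
from collections import Counter
import math
from nltk.corpus import wordnet as wn

def expanded_gloss(synset, depth=1):
    """Words of the synset's gloss plus the glosses of its hypernyms
    up to `depth` levels (the hierarchical expansion)."""
    words = synset.definition().lower().split()
    if depth > 0:
        for hyp in synset.hypernyms():
            words += expanded_gloss(hyp, depth - 1)
    return words

def cosine(u, v):
    num = sum(u[k] * v[k] for k in u.keys() & v.keys())
    den = math.sqrt(sum(x * x for x in u.values())) \
        * math.sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0

def best_sense(word, context_words):
    """Pick the sense whose expanded-gloss vector is closest to the
    context vector (assumes `word` is in WordNet)."""
    ctx = Counter(w.lower() for w in context_words)
    return max(wn.synsets(word),
               key=lambda s: cosine(ctx, Counter(expanded_gloss(s))))

# e.g. best_sense("bank", "fish swim near the muddy river shore".split())
```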

Evaluating sense disambiguation across diverse parameter spaces

Natural Language Engineering, 2002

This paper presents a comprehensive empirical exploration and evaluation of a diverse range of data characteristics which influence word sense disambiguation performance. It focuses on a set of six core supervised algorithms, including three variants of Bayesian classifiers, a cosine model, non-hierarchical decision lists, and an extension of the transformation-based learning model. Performance is investigated in detail with respect to the following parameters: (a) target language (English, Spanish, Swedish and Basque); (b) part of speech; (c) sense granularity; (d) inclusion and exclusion of major feature classes; (e) variable context width (further broken down by part-of-speech of keyword); (f) number of training examples; (g) baseline probability of the most likely sense; (h) sense distributional entropy; (i) number of senses per keyword; (j) divergence between training and test data; (k) degree of (artificially introduced) noise in the training data; (l) the effectiveness of an ...

A Semantics-Enhanced Language Model for Unsupervised Word Sense Disambiguation

2008

An N-gram language model aims at capturing statistical word order dependency information from corpora. Although the concept of language models has been applied extensively to handle a variety of NLP problems with reasonable success, the standard model does not incorporate semantic information, and consequently limits its applicability to semantic problems such as word sense disambiguation. We propose a framework that integrates semantic information into the language model schema, allowing a system to exploit both syntactic and semantic information to address NLP problems. Furthermore, acknowledging the limited availability of semantically annotated data, we discuss how the proposed model can be learned without annotated training examples. Finally, we report on a case study showing how the semantics-enhanced language model can be applied to unsupervised word sense disambiguation with promising results.
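To make the idea concrete, here is a heavily simplified, hypothetical sketch of a "semantic" bigram component over WordNet supersenses; the paper's actual model, its integration with the word-level n-gram, and its unsupervised learning procedure are richer than this:

```python
from collections import Counter
from nltk.corpus import wordnet as wn

def supersense(word):
    """Map a word to the supersense (lexname) of its first WordNet
    sense; the first-sense choice is an assumption for brevity."""
    syns = wn.synsets(word)
    return syns[0].lexname() if syns else "UNK"

def train_semantic_bigrams(sentences):
    """Bigram counts over supersense tags from raw (untagged) text."""
    uni, bi = Counter(), Counter()
    for sent in sentences:
        tags = [supersense(w) for w in sent]
        uni.update(tags)
        bi.update(zip(tags, tags[1:]))
    return uni, bi

def best_sense(sentence, i, uni, bi):
    """Choose the sense of sentence[i] whose supersense fits best
    between its neighbours' supersenses (add-one smoothing); assumes
    the target word has at least one WordNet sense."""
    V = len(uni) + 1
    prev = supersense(sentence[i - 1]) if i > 0 else "UNK"
    nxt = supersense(sentence[i + 1]) if i + 1 < len(sentence) else "UNK"
    def score(s):
        tag = s.lexname()
        return ((bi[(prev, tag)] + 1) / (uni[prev] + V)) * \
               ((bi[(tag, nxt)] + 1) / (uni[tag] + V))
    return max(wn.synsets(sentence[i]), key=score)
```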
