Word Embeddings, Analogies, and Machine Learning: Beyond King - Man + Woman = Queen
Related papers
The (Too Many) Problems of Analogical Reasoning with Word Vectors
This paper explores the possibilities of analogical reasoning with vector space models. Given two pairs of words with the same relation (e.g. man:woman :: king:queen), it was proposed that the offset between the corresponding word vectors of one pair can be used to identify the unknown member of the other pair (king − man + woman ≈ queen). We argue against such "linguistic regularities" as a model for linguistic relations in vector space models and as a benchmark, and we show that the vector offset (as well as two other, better-performing methods) suffers from dependence on vector similarity.
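For reference, a minimal sketch of the vector-offset method under discussion (often called 3CosAdd), assuming `embeddings` is a dict mapping words to numpy vectors; the function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

def vector_offset(embeddings, a, b, c):
    """Answer a:b :: c:? by returning the word whose vector is closest
    (by cosine) to b - a + c, excluding the three query words."""
    target = normalize(embeddings[b] - embeddings[a] + embeddings[c])
    best_word, best_score = None, -float("inf")
    for word, vec in embeddings.items():
        if word in (a, b, c):
            continue
        score = float(np.dot(normalize(vec), target))
        if score > best_score:
            best_word, best_score = word, score
    return best_word

# With suitable embeddings, vector_offset(emb, "man", "woman", "king")
# is expected (by the original claim) to return "queen".
```

Note that excluding the query words from the candidate set is an implementation detail the critique literature highlights as consequential: without it, the offset frequently returns one of the input words.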
(Presentation) Fitting Semantic Relations to Word Embeddings
Proceedings of the 10th Global Wordnet Conference, 2019
We fit WordNet relations to word embeddings using 3CosAvg and LRCos, two set-based methods for analogy resolution, and introduce 3CosWeight, a new, weighted variant of 3CosAvg. We test the performance of the resulting semantic vectors in lexicographic semantics tests and show that none of the tested classifiers can learn symmetric relations like synonymy and antonymy, since the source and target words of these relations are the same set. By contrast, for the asymmetric relations (hyperonymy/hyponymy and meronymy), both 3CosAvg and LRCos clearly outperform the baseline in all cases, while 3CosWeight attains the best scores for hyponymy and meronymy, suggesting that this new method could provide a useful alternative to previous approaches.
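For illustration, a sketch of 3CosAvg as described above: the offset is averaged over a whole set of known (source, target) training pairs and added to the query source word. Names such as `embeddings` and `cos_avg` are assumptions; LRCos and the exact 3CosWeight weighting are not reproduced here.

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

def cos_avg(embeddings, train_pairs, source_word):
    """3CosAvg: average the target-minus-source offset over known pairs
    of a relation, add it to the query source word, and return the
    nearest word by cosine similarity (excluding the query word itself;
    the exact exclusion policy is an implementation choice)."""
    offset = np.mean(
        [embeddings[t] - embeddings[s] for s, t in train_pairs], axis=0
    )
    query = normalize(embeddings[source_word] + offset)
    best_word, best_score = None, -float("inf")
    for word, vec in embeddings.items():
        if word == source_word:
            continue
        score = float(np.dot(normalize(vec), query))
        if score > best_score:
            best_word, best_score = word, score
    return best_word
```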
Improving relational similarity measurement using symmetries in proportional word analogies
Information Processing & Management, 2013
Measuring the similarity between the semantic relations that exist between words is an important step in numerous tasks in natural language processing such as answering word analogy questions, classifying compound nouns, and word sense disambiguation. Given two word pairs (A, B) and (C, D), we propose a method to measure the relational similarity between the semantic relations that exist between the two words in each word pair. Typically, a high degree of relational similarity can be observed between proportional analogies (i.e. analogies that hold among the four words: A is to B as C is to D). We describe eight different types of relational symmetries that are frequently observed in proportional analogies and use those symmetries to robustly and accurately estimate the relational similarity between two given word pairs. We use automatically extracted lexical-syntactic patterns to represent the semantic relations that exist between two words and then match those patterns in Web search engine snippets to find candidate words that form proportional analogies with the original word pair. We use the eight types of relational symmetries as features in a supervised learning approach. We evaluate the proposed method using the Scholastic Aptitude Test (SAT) word analogy benchmark dataset. Our experimental results show that the proposed method can accurately measure relational similarity between word pairs by exploiting the symmetries that exist in proportional analogies. The proposed method achieves an SAT score of 49.2% on the benchmark dataset, which is comparable to the best results reported on this dataset.
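The sketch below enumerates the eight permutations of (A, B, C, D) that are usually taken to preserve a proportional analogy A:B :: C:D; this is the kind of symmetry the paper turns into features, though its exact feature construction may differ.

```python
def symmetric_forms(a, b, c, d):
    """The eight argument orders that preserve a proportional analogy
    a:b :: c:d: the original, symmetry of '::', inversion of both
    ratios, exchange of the means, and their combinations."""
    return [
        (a, b, c, d),  # original
        (c, d, a, b),  # swap the two sides
        (b, a, d, c),  # invert both ratios
        (d, c, b, a),  # swap sides and invert
        (a, c, b, d),  # exchange the means
        (c, a, d, b),
        (b, d, a, c),
        (d, b, c, a),
    ]

# symmetric_forms("man", "woman", "king", "queen") includes, e.g.,
# ("king", "queen", "man", "woman") and ("man", "king", "woman", "queen").
```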
On the Information Content of Predictions in Word Analogy Tests
Journal of Communication and Information Systems
An approach is proposed to quantify, in bits of information, the actual relevance of analogies in analogy tests. The main component of this approach is a soft accuracy estimator that also yields entropy estimates with compensated biases. Experimental results obtained with pre-trained GloVe 300-D vectors and two public analogy test sets show that proximity hints are much more relevant than analogies in analogy tests, from an information content perspective. Accordingly, a simple word embedding model is used to predict that analogies carry about one bit of information, which is experimentally corroborated.
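A rough sketch of the kind of comparison this points to, assuming embeddings are given as a dict of numpy vectors: how often the gold answer is already the nearest neighbour of the third query word (a pure proximity hint) versus being found by the vector offset. Function names are illustrative, and this is not the paper's estimator.

```python
import numpy as np

def _nearest(embeddings, target, exclude):
    # Return the vocabulary word closest to `target` by cosine similarity.
    target = target / np.linalg.norm(target)
    best_word, best_score = None, -float("inf")
    for word, vec in embeddings.items():
        if word in exclude:
            continue
        score = float(np.dot(vec / np.linalg.norm(vec), target))
        if score > best_score:
            best_word, best_score = word, score
    return best_word

def compare_hints(embeddings, questions):
    """questions: iterable of (a, b, c, d) analogy items.
    Returns the accuracy of the proximity-only baseline (nearest
    neighbour of c) and of the vector offset b - a + c."""
    proximity_hits = offset_hits = total = 0
    for a, b, c, d in questions:
        exclude = {a, b, c}
        if _nearest(embeddings, embeddings[c], exclude) == d:
            proximity_hits += 1
        offset = embeddings[b] - embeddings[a] + embeddings[c]
        if _nearest(embeddings, offset, exclude) == d:
            offset_hits += 1
        total += 1
    return proximity_hits / total, offset_hits / total
```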
Geometry and Analogies: A Study and Propagation Method for Word Representations
2019
In this paper we discuss the well-known claim that language analogies yield almost parallel vector differences in word embeddings. On the one hand, we show that this property, while it does hold for a handful of cases, fails to hold in general, especially in high dimensions, using the best-known publicly available word embeddings. On the other hand, we show that this property is not crucial for basic natural language processing tasks such as text classification. We demonstrate this with a simple algorithm that yields updated word embeddings in which the property does hold: text classification with these representations achieves about the same performance.
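The parallelism claim can be probed directly; a small sketch, assuming a dict of numpy vectors, computes the pairwise cosine similarities between the normalised offset vectors of word pairs that share a relation. Values near 1 would support near-parallelism.

```python
import numpy as np
from itertools import combinations

def offset_parallelism(embeddings, pairs):
    """Pairwise cosine similarities between the difference vectors of
    word pairs sharing a relation; returns the mean and the minimum."""
    offsets = []
    for a, b in pairs:
        diff = embeddings[b] - embeddings[a]
        offsets.append(diff / np.linalg.norm(diff))
    sims = [float(np.dot(u, v)) for u, v in combinations(offsets, 2)]
    return float(np.mean(sims)), float(np.min(sims))

# e.g. offset_parallelism(emb, [("man", "woman"), ("king", "queen"),
#                               ("actor", "actress")])
```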
Towards Useful Word Embeddings
RASLAN, 2020
Since the seminal work of Mikolov et al. (2013), word vectors of log-bilinear models have found their way into many NLP applications and were extended with the positional model. Although the positional model improves accuracy on the intrinsic English word analogy task, prior work has neglected its evaluation on extrinsic end tasks, which correspond to real-world NLP applications. In this paper, we describe our first steps in evaluating positional weighting on the information retrieval, text classification, and language modeling extrinsic end tasks.
Embedding Semantic Relations into Word Representations
Learning representations for semantic relations is important for various tasks such as analogy detection, relational search, and relation classification. Although there have been several proposals for learning representations for individual words, learning word representations that explicitly capture the semantic relations between words remains underdeveloped. We propose an unsupervised method for learning vector representations for words such that the learnt representations are sensitive to the semantic relations that exist between two words. First, we extract lexical patterns from the co-occurrence contexts of two words in a corpus to represent the semantic relations that exist between those two words. Second, we represent a lexical pattern as the weighted sum of the representations of the words that co-occur with that lexical pattern. Third, we train a binary classifier to detect relationally similar vs. non-similar lexical pattern pairs. The proposed method is unsupervised in ...
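A hedged sketch of the second step described above: a lexical pattern is represented as a weighted sum of the embeddings of the words it co-occurs with. The weighting shown (normalised co-occurrence counts) and all names are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def pattern_vector(embeddings, cooccurrence_counts, dim):
    """Represent a lexical pattern as the weighted sum of the vectors of
    words co-occurring with it, weighted by co-occurrence count."""
    vec = np.zeros(dim)
    total = 0.0
    for word, count in cooccurrence_counts.items():
        if word in embeddings:
            vec += count * embeddings[word]
            total += count
    return vec / total if total > 0 else vec

# e.g. for a pattern like "X is a large Y":
# pattern_vector(emb, {"ostrich": 3, "bird": 5, "flightless": 2}, dim=300)
```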
Benchmarking Semantic Capabilities of Analogy Querying Algorithms
Lecture Notes in Computer Science, 2016
Enabling semantically rich query paradigms is one of the core challenges of current information systems research. In this context, due to their importance and ubiquity in natural language, analogy queries are of particular interest. Current developments in natural language processing and machine learning have resulted in some very promising algorithms relying on deep learning neural word embeddings, which might contribute to finally realizing analogy queries. However, it is still quite unclear how well these algorithms work from a semantic point of view. One of the problems is that there is no clear consensus on the intended semantics of analogy queries. Furthermore, no suitable benchmark datasets are available that respect the semantic properties of real-life analogies. Therefore, in this paper, we discuss the challenges of benchmarking the semantics of analogy query algorithms with a special focus on neural embeddings. We also introduce the AGS analogy benchmark dataset, which rectifies many weaknesses of established datasets. Finally, our experiments evaluating state-of-the-art algorithms underline the need for further research in this promising field.
Comparative Analysis of Word Embeddings for Capturing Word Similarities
Distributed language representation has become the most widely used technique for language representation in various natural language processing tasks. Most natural language processing models based on deep learning techniques use already pre-trained distributed word representations, commonly called word embeddings. Determining the highest-quality word embeddings is of crucial importance for such models. However, selecting the appropriate word embeddings is a perplexing task, since the projected embedding space is not intuitive to humans. In this paper, we explore different approaches for creating distributed word representations. We perform an intrinsic evaluation of several state-of-the-art word embedding methods. Their performance on capturing word similarities is analysed with existing benchmark datasets for word-pair similarities. The research in this paper conducts a correlation analysis between ground-truth word similarities and similarities obtained by different word embedding methods.
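A minimal sketch of such an intrinsic evaluation: the Spearman correlation between human word-pair similarity ratings and embedding cosine similarities. The benchmark format and names are assumptions; scipy is assumed to be available.

```python
import numpy as np
from scipy.stats import spearmanr

def evaluate_similarity(embeddings, benchmark):
    """benchmark: iterable of (word1, word2, human_score) triples.
    Returns the Spearman correlation between human ratings and cosine
    similarities, ignoring pairs with out-of-vocabulary words."""
    human, model = [], []
    for w1, w2, score in benchmark:
        if w1 in embeddings and w2 in embeddings:
            v1, v2 = embeddings[w1], embeddings[w2]
            cos = float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))
            human.append(score)
            model.append(cos)
    return spearmanr(human, model).correlation
```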
Querying Word Embeddings for Similarity and Relatedness
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2018
Word embeddings obtained from neural network models such as Word2Vec Skipgram have become popular representations of word meaning and have been evaluated on a variety of word similarity and relatedness norming data. Skipgram generates a set of word and context embeddings, the latter typically discarded after training. We demonstrate the usefulness of context embeddings in predicting asymmetric association between words from a recently published dataset of production norms (Jouravlev and McRae, 2016). Our findings suggest that humans respond with words closer to the cue within the context embedding space (rather than the word embedding space), when asked to generate thematically related words.
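A sketch of the comparison the abstract describes, assuming both the word (input) and context (output) matrices of a Skip-gram model are kept: a cue-response pair can be scored within the word space or across the word and context spaces. Names are illustrative.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association_scores(word_vecs, context_vecs, cue, response):
    """Compare cue-response proximity in the word space (word vector vs
    word vector) with proximity across spaces (the cue's word vector vs
    the response's context vector)."""
    return {
        "word_space": cosine(word_vecs[cue], word_vecs[response]),
        "word_to_context": cosine(word_vecs[cue], context_vecs[response]),
    }
```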