Alternative measures of word relatedness in distributional semantics (original) (raw)

This paper presents an alternative method to measuring word-word semantic relatedness in distributional semantics framework. The main idea is to represent target words as rankings of all co-occurring words in a text corpus, ordered by their tf – idf weight and use a metric between rankings (such as Jaro distance or Rank distance) to compute semantic relatedness. This method has several advantages over the standard approach that uses cosine measure in a vector space, mainly in that it is computationally less expensive (i.e. does not require working in a high dimensional space, employing only rankings and a distance which is linear in the rank’s length) and presumably more robust. We tested this method on the standard WS353 Test, obtaining the co-occurrence frequency from the Wacky corpus. The results are comparable to the methods which use vector space models; and, most importantly, the method can be extended to the very challenging task of measuring phrase semantic relatedness.

Sign up for access to the world's latest research.

checkGet notified about relevant papers

checkSave papers to use in your research

checkJoin the discussion with peers

checkTrack your impact

Loading...

Loading Preview

Sorry, preview is currently unavailable. You can download the paper by clicking the button above.