word2vec - Map word to embedding vector - MATLAB
Map word to embedding vector
Syntax
M = word2vec(emb,words)
M = word2vec(emb,words,'IgnoreCase',true)
Description
M = word2vec(emb,words) returns the embedding vectors of words in the embedding emb. If a word is not in the embedding vocabulary, then the function returns a row of NaN values. By default, the function is case sensitive.
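The NaN behavior described above gives a simple way to detect and drop out-of-vocabulary words. The following is a minimal sketch, assuming the fastText support package is installed. The token "notARealWord123" is a made-up string presumed to be missing from the vocabulary, and the variable names are illustrative.

```matlab
% Load a pretrained embedding (requires the fastText support package).
emb = fastTextWordEmbedding;

% The second entry is a made-up token, presumed to be out of vocabulary.
words = ["Italy" "notARealWord123"];
M = word2vec(emb,words);

% A row made up entirely of NaN values marks a word that was not found.
isMissing = all(isnan(M),2);

% Keep only the words that have an embedding vector.
M(isMissing,:) = [];
words(isMissing) = [];
```

The isVocabularyWord function in Text Analytics Toolbox offers another way to check membership before calling word2vec.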
M = word2vec(emb,words,'IgnoreCase',true) returns the embedding vectors of words, ignoring case. If multiple words in the embedding differ only in case, then the function returns the vector corresponding to one of them, with no guarantee as to which one.
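As a brief illustration of the difference between the two syntaxes, the sketch below looks up the same word with and without case matching. It assumes the fastText support package is installed. Whether the all-caps form "ITALY" is in the vocabulary is not stated in this page, so the case-sensitive lookup may return NaN values.

```matlab
% Load a pretrained embedding (requires the fastText support package).
emb = fastTextWordEmbedding;

% Case-sensitive lookup (the default). This returns a row of NaN values
% if this exact casing is not in the vocabulary.
vExact = word2vec(emb,"ITALY");

% Case-insensitive lookup. This matches a vocabulary word that differs
% from the input only in case, such as "Italy".
vLoose = word2vec(emb,"ITALY",'IgnoreCase',true);

% Check whether each lookup found a vector.
foundExact = ~any(isnan(vExact));
foundLoose = ~any(isnan(vLoose));
```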
Examples
Load a pretrained word embedding using fastTextWordEmbedding. This function requires the Text Analytics Toolbox™ Model for fastText English 16 Billion Token Word Embedding support package. If this support package is not installed, then the function provides a download link.
emb = fastTextWordEmbedding
emb = 
  wordEmbedding with properties:

     Dimension: 300
    Vocabulary: [1×1000000 string]
Map the words "Italy", "Rome", and "Paris" to vectors using word2vec.
italy = word2vec(emb,"Italy");
rome = word2vec(emb,"Rome");
paris = word2vec(emb,"Paris");
Map the vector italy - rome + paris to a word using vec2word.
word = vec2word(emb,italy - rome + paris)
Input Arguments
emb - Input word embedding, specified as a wordEmbedding object.
words - Input words, specified as a string vector, character vector, or cell array of character vectors. If you specify words as a character vector, then the function treats the argument as a single word.
Data Types: string | char | cell
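The three accepted input forms can be sketched as follows. This example assumes the fastText support package is installed, and the variable names M1, M2, and M3 are illustrative.

```matlab
% Load a pretrained embedding (requires the fastText support package).
emb = fastTextWordEmbedding;

% String vector: each element is treated as one word.
M1 = word2vec(emb,["Italy" "Rome"]);

% Character vector: the whole argument is treated as a single word.
M2 = word2vec(emb,'Italy');

% Cell array of character vectors: each cell is treated as one word.
M3 = word2vec(emb,{'Italy','Rome'});
```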
Output Arguments
M - Matrix of word embedding vectors, with one row for each word in words and one column for each dimension of the embedding.
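To make the shape of the output concrete, the following sketch checks the size of M against the number of input words and the Dimension property of the embedding. It assumes the fastText support package is installed.

```matlab
% Load a pretrained embedding (requires the fastText support package).
emb = fastTextWordEmbedding;

words = ["Italy" "Rome" "Paris"];
M = word2vec(emb,words);

% M has one row per input word and one column per embedding dimension.
hasExpectedSize = isequal(size(M),[numel(words) emb.Dimension]);

% Row k of M is the vector for words(k).
romeVector = M(2,:);
```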
Version History
Introduced in R2017b