Sense embedding
word2vec
GloVe
CBOW
Skip-gram
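A minimal sketch of the two word2vec architectures, assuming gensim 4.x (the toy corpus is invented for illustration):

```python
# Minimal sketch: training CBOW vs. skip-gram embeddings with gensim 4.x.
from gensim.models import Word2Vec

corpus = [
    ["the", "bank", "approved", "the", "loan"],
    ["the", "river", "bank", "was", "muddy"],
]

# sg=0 -> CBOW (predict a word from its surrounding context);
# sg=1 -> skip-gram (predict the context words from the word).
cbow = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=0)
skipgram = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1)

print(cbow.wv["bank"][:5])                       # one static vector per word form
print(skipgram.wv.most_similar("bank", topn=3))
```

Note that both occurrences of "bank" (financial and river) collapse into a single vector, which is exactly the limitation below.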
words can have multiple meanings (polysemy)
goal: capture the different meanings of the same word with separate vectors
Unsupervised
knowledge-based
two-stage models
joint training
contextualized embedding
Pros: address the knowledge-acquisition bottleneck for sense-annotated data
clustering the contexts in which an ambiguous word occurs (see the clustering sketch below)
Cons: computationally expensive
Cons: clustering and sense representation are done independently
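A sketch of the two-stage idea under stated assumptions: `word_vecs` is a hypothetical pretrained {word: vector} lookup, and the number of senses k is fixed by hand. Each occurrence of the ambiguous word is represented by its averaged context vector, then the occurrences are clustered; the clustering step never feeds back into the representations, which is the independence drawback noted above.

```python
# Two-stage sketch: (1) embed each occurrence of an ambiguous word by
# averaging its context word vectors, (2) cluster the occurrences;
# cluster centroids act as sense vectors.
import numpy as np
from sklearn.cluster import KMeans

def context_vector(tokens, position, word_vecs, window=2):
    """Average the vectors of words around `position` (target word excluded).
    Assumes at least one context word is covered by word_vecs."""
    ctx = tokens[max(0, position - window):position] + tokens[position + 1:position + 1 + window]
    vecs = [word_vecs[w] for w in ctx if w in word_vecs]
    return np.mean(vecs, axis=0)

def induce_senses(occurrences, word_vecs, k=2):
    """occurrences: list of (tokens, position) pairs for one ambiguous word;
    needs at least k occurrences."""
    X = np.stack([context_vector(t, p, word_vecs) for t, p in occurrences])
    km = KMeans(n_clusters=k, n_init=10).fit(X)
    return km.cluster_centers_  # one vector per induced sense
```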
Pros: efficient and unified nature
Cons: assume a fixed number of senses per word
Cons: sense representations are conditioned on the word embeddings of the surrounding context
To address the limitations above: dynamic polysemy, pure sense-based models
The above approaches are difficult to integrate into downstream models
contextualized embeddings change depending on the context in which a word appears
the embeddings are the internal states of an RNN
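A toy sketch of that mechanism in PyTorch (a randomly initialised BiLSTM, not a trained language model such as ELMo): the hidden state at each position serves as the token's embedding, so the same word form receives a different vector in each sentence.

```python
# Contextualized-embedding sketch: token vectors are the internal states
# of an RNN run over the sentence.
import torch
import torch.nn as nn

vocab = {"the": 0, "bank": 1, "approved": 2, "loan": 3,
         "river": 4, "was": 5, "muddy": 6}

embed = nn.Embedding(len(vocab), 32)
bilstm = nn.LSTM(input_size=32, hidden_size=32,
                 bidirectional=True, batch_first=True)

def contextual_vectors(tokens):
    ids = torch.tensor([[vocab[t] for t in tokens]])
    states, _ = bilstm(embed(ids))   # hidden state at every position
    return states[0]                 # (seq_len, 2 * hidden_size)

v1 = contextual_vectors(["the", "bank", "approved", "the", "loan"])[1]
v2 = contextual_vectors(["the", "river", "bank", "was", "muddy"])[2]
print(torch.cosine_similarity(v1, v2, dim=0))  # "bank" differs across contexts
```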
knowledge resources: WordNet, Wikipedia, Freebase, Wikidata, DBpedia, BabelNet, ConceptNet, PPDB
knowledge-enhanced word representation
combine text corpora with lexical resources
improve the semantic coherence or coverage of existing word embeddings (see the retrofitting sketch below)
useful in the construction of multilingual vector spaces
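One common instance of combining corpora with lexical resources is retrofitting (Faruqui et al., 2015); below is a simplified sketch of its iterative update, where `word_vecs` and `lexicon` are hypothetical inputs (e.g., WordNet synonym lists):

```python
# Simplified retrofitting sketch: nudge each word vector toward the
# average of its lexicon neighbours while staying anchored to the
# original corpus-trained vector.
import numpy as np

def retrofit(word_vecs, lexicon, iterations=10):
    """word_vecs: {word: np.ndarray}; lexicon: {word: [neighbour words]}."""
    new_vecs = {w: v.copy() for w, v in word_vecs.items()}
    for _ in range(iterations):
        for word, neighbours in lexicon.items():
            nbrs = [n for n in neighbours if n in new_vecs]
            if word not in new_vecs or not nbrs:
                continue
            # One part original embedding, one part per lexicon neighbour.
            total = word_vecs[word] + sum(new_vecs[n] for n in nbrs)
            new_vecs[word] = total / (1 + len(nbrs))
    return new_vecs
```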
Knowledge-based sense representation
knowledge-based concept and entity representation
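A minimal sketch of one knowledge-based strategy: build a vector per WordNet synset by averaging the word vectors of the words in its gloss. Assumes NLTK with the WordNet data installed and a hypothetical pretrained `word_vecs` lookup.

```python
# Gloss-averaging sketch: one vector per WordNet sense of a word.
import numpy as np
from nltk.corpus import wordnet as wn

def sense_vectors(word, word_vecs):
    senses = {}
    for synset in wn.synsets(word):
        gloss = [w.lower() for w in synset.definition().split()]
        vecs = [word_vecs[w] for w in gloss if w in word_vecs]
        if vecs:
            senses[synset.name()] = np.mean(vecs, axis=0)
    return senses  # e.g., {'bank.n.01': ..., 'bank.n.09': ...}
```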
Evaluation
Intrinsic: word similarity and relatedness benchmarks (see the sketch below)
Extrinsic: performance when plugged into downstream tasks
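A sketch of a standard intrinsic evaluation: Spearman correlation between model cosine similarities and human similarity judgements (WordSim-353 style). The three pairs below are illustrative, and `word_vecs` is a hypothetical {word: vector} lookup.

```python
# Intrinsic evaluation sketch: rank-correlate model similarity with
# human similarity judgements.
import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def evaluate(word_vecs, judgements):
    """judgements: list of (word1, word2, human_score) triples."""
    model_scores = [cosine(word_vecs[a], word_vecs[b]) for a, b, _ in judgements]
    human_scores = [s for _, _, s in judgements]
    return spearmanr(model_scores, human_scores).correlation

judgements = [("tiger", "cat", 7.35), ("book", "paper", 7.46), ("king", "cabbage", 0.23)]
# rho = evaluate(word_vecs, judgements)
```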