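spaCy has implemented a sense2vec word-embedding package, which is documented here.
The vectors are keyed in the format WORD|POS. For example, the sentence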
Dear local newspaper, I think effects computers have on people are great learning skills/affects because they give us time to chat with friends/new people, helps us learn about the globe(astronomy) and keeps us out of trouble
needs to be converted to
Dear|ADJ local|ADJ newspaper|NOUN ,|PUNCT I|PRON think|VERB effects|NOUN computers|NOUN have|VERB on|ADP people|NOUN are|VERB great|ADJ learning|NOUN skills/affects|NOUN because|ADP they|PRON give|VERB us|PRON time|NOUN to|PART chat|VERB with|ADP friends/new|ADJ people|NOUN ,|PUNCT helps|VERB us|PRON learn|VERB about|ADP the|DET globe(astronomy|NOUN )|PUNCT and|CONJ keeps|VERB us|PRON out|ADP of|ADP trouble|NOUN !|PUNCT
so that it can be looked up in the pretrained sense2vec embeddings, which expect this format.
How can this be done?
Answer 0 (score: 0)
Based on spaCy's bin/merge.py implementation, which does exactly what is needed:
print(tag_words_in_sense2vec_format("Dear local newspaper, ..."))
results in
Dear|ADJ local|ADJ newspaper|NOUN ,|PUNCT ...
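The body of tag_words_in_sense2vec_format is not shown in this answer. As a rough illustration only, a minimal version could look like the sketch below. It is not the code from bin/merge.py and does not attempt any phrase merging that script may perform. The helper name format_tokens and the model name "en_core_web_sm" are assumptions made for this example, and the exact POS tags you get depend on the spaCy model and version you load.

```python
from typing import Iterable


def format_tokens(tokens: Iterable) -> str:
    """Join token-like objects (anything with .text and .pos_) as WORD|POS.

    Hypothetical helper for illustration; not part of spaCy or sense2vec.
    """
    parts = []
    for token in tokens:
        text = token.text.strip()
        if not text:
            continue  # skip tokens that are pure whitespace
        # Use underscores for spaces so each key stays a single
        # whitespace-delimited unit in the output string.
        parts.append(text.replace(" ", "_") + "|" + token.pos_)
    return " ".join(parts)


def tag_words_in_sense2vec_format(text: str, nlp=None) -> str:
    """Tag `text` with spaCy and return it in WORD|POS form.

    `nlp` can be any callable returning tokens; if omitted, a spaCy
    pipeline is loaded (the model name below is an assumption).
    """
    if nlp is None:
        import spacy  # imported lazily so format_tokens works without spaCy

        nlp = spacy.load("en_core_web_sm")  # assumed model name
    return format_tokens(nlp(text))
```

Passing an already loaded pipeline as nlp avoids reloading the model on every call, which matters when tagging many sentences.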