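How to tag sentences for spaCy's sense2vec implementation

Time: 2017-09-24 03:19:13

Tags: python nlp spacy sense2vec

spaCy has implemented a sense2vec word embeddings package, which they have documented here

The vectors are keyed in the form WORD|POS. For example, the sentence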

Dear local newspaper, I think effects computers have on people are great learning skills/affects because they give us time to chat with friends/new people, helps us learn about the globe(astronomy) and keeps us out of trouble

needs to be converted into

Dear|ADJ local|ADJ newspaper|NOUN ,|PUNCT I|PRON think|VERB effects|NOUN computers|NOUN have|VERB on|ADP people|NOUN are|VERB great|ADJ learning|NOUN skills/affects|NOUN because|ADP they|PRON give|VERB us|PRON time|NOUN to|PART chat|VERB with|ADP friends/new|ADJ people|NOUN ,|PUNCT helps|VERB us|PRON learn|VERB about|ADP the|DET globe(astronomy|NOUN )|PUNCT and|CONJ keeps|VERB us|PRON out|ADP of|ADP trouble|NOUN !|PUNCT

so that it is in sense2vec format and can be looked up in the sense2vec pretrained embeddings.

How can this be done?

1 Answer:

Answer 0: (Score: 0)

Based on spaCy's bin/merge.py implementation, which does exactly what is needed:
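A minimal sketch of what such a function could look like, not the original listing. It assumes a spaCy pipeline whose tokens expose `.text` and `.pos_`; the model name `en_core_web_sm` and the helper names `represent_word` and `to_sense2vec_format` are illustrative assumptions. As far as I know the real merge.py also merges noun phrases and named entities into single tokens, which this simplified version skips.

```python
import re


def represent_word(token):
    """Format one token as TEXT|POS, replacing inner whitespace with underscores."""
    text = re.sub(r"\s", "_", token.text)
    tag = token.pos_ or "?"  # fall back to "?" when no POS tag is available
    return text + "|" + tag


def to_sense2vec_format(tokens):
    """Join the TEXT|POS form of every non-whitespace token with single spaces."""
    return " ".join(represent_word(t) for t in tokens if t.text.strip())


def tag_words_in_sense2vec_format(passage, nlp=None):
    """Tag a passage with spaCy and return it as a WORD|POS string."""
    if nlp is None:
        # Assumption: spaCy and this English model are installed.
        import spacy
        nlp = spacy.load("en_core_web_sm")
    return to_sense2vec_format(nlp(passage))
```

The exact tags depend on the spaCy version and model, so the output may differ slightly from the example in the question. Passing an already loaded `nlp` object avoids reloading the model on every call.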

Calling

print(tag_words_in_sense2vec_format("Dear local newspaper, ..."))

results in

Dear|ADJ local|ADJ newspaper|NOUN ,|PUNCT ...
