I have trained my model with Gensim. Now I want to evaluate it on SimLex-999, but I get an error. My code:
model.wv.evaluate_word_analogies('SimLex-999.txt')
2019-08-25 13:43:22,766 : INFO : Evaluating word analogies for top 300000 words in the model on SimLex-999.txt
The error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-12-60cb96c45579> in <module>()
----> 1 model.wv.evaluate_word_analogies('SimLex-999.txt')
C:\ProgramData\Anaconda3\lib\site-packages\gensim\models\keyedvectors.py in evaluate_word_analogies(self, analogies, restrict_vocab, case_insensitive, dummy4unknown)
1088 else:
1089 if not section:
-> 1090 raise ValueError("Missing section header before line #%i in %s" % (line_no, analogies))
1091 try:
1092 if case_insensitive:
ValueError: Missing section header before line #0 in SimLex-999.txt
I also tried:
from gensim.test.utils import datapath
similarities = model.evaluate_word_pairs(datapath('SimLex-999.txt'))
print(similarities)
But that gives me a KeyError. Please help me solve this problem.
KeyError Traceback (most recent call last)
<ipython-input-29-caeb682cb7ff> in <module>()
1 from gensim.test.utils import datapath
2
----> 3 similarities = model.wv.evaluate_word_pairs(datapath('SimLex-999.txt'),dummy4unknown=True)
4
5 print(similarities)
C:\ProgramData\Anaconda3\lib\site-packages\gensim\models\keyedvectors.py in evaluate_word_pairs(self, pairs, delimiter, restrict_vocab, case_insensitive, dummy4unknown)
1287
1288 """
-> 1289 ok_vocab = [(w, self.vocab[w]) for w in self.index2word[:restrict_vocab]]
1290 ok_vocab = {w.upper(): v for w, v in reversed(ok_vocab)} if case_insensitive else dict(ok_vocab)
1291
C:\ProgramData\Anaconda3\lib\site-packages\gensim\models\keyedvectors.py in <listcomp>(.0)
1287
1288 """
-> 1289 ok_vocab = [(w, self.vocab[w]) for w in self.index2word[:restrict_vocab]]
1290 ok_vocab = {w.upper(): v for w, v in reversed(ok_vocab)} if case_insensitive else dict(ok_vocab)
1291
KeyError: 'movie'
Answer 0 (score: 0)
SimLex-999.txt does not appear to be a word-analogy list, which is what evaluate_word_analogies() expects as its argument.
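To illustrate why the ValueError mentions a missing section header: the traceback shows that evaluate_word_analogies() raises as soon as it meets a data line before any section has been seen, and in analogy files such as questions-words.txt sections are introduced by lines starting with ": ". The sketch below only mimics that first check in plain Python. The helper name and both sample inputs are made up for illustration and are not part of Gensim.

```python
def first_line_is_section_header(lines):
    """Return True if the first non-blank line starts with ': '.

    Hypothetical helper: it mirrors only the first check an analogy
    evaluator performs, not Gensim's actual implementation.
    """
    for line in lines:
        if line.strip():
            return line.startswith(": ")
    return False


# Illustrative analogy-style input: a section header, then four words per line.
analogy_sample = [": capital-common-countries", "Athens Greece Baghdad Iraq"]

# Illustrative similarity-style input: tab-separated columns, no section header.
# The column names and values here are assumptions about the SimLex-999 layout.
simlex_sample = ["word1\tword2\tPOS\tSimLex999", "old\tnew\tA\t1.58"]

print(first_line_is_section_header(analogy_sample))
print(first_line_is_section_header(simlex_sample))
```

A file that fails this check is a word-similarity dataset rather than an analogy dataset, so it belongs with a pair-based evaluation.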
Have you tried the evaluate_word_pairs() function instead? Its documentation is at:
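One thing to check before calling evaluate_word_pairs(): as far as I know, it reads lines of the form word1, word2, score separated by a delimiter (tab by default) and skips lines it cannot parse. The SimLex-999.txt distributed by its authors has, I believe, a header row and extra columns, so it may need trimming first. Here is a minimal sketch, assuming the header contains columns named word1, word2 and SimLex999. The function name and the output file name are hypothetical.

```python
import csv


def simlex_to_pairs(src_path, dst_path):
    """Write word1, word2 and the SimLex999 score as three tab-separated columns.

    Assumes src_path is tab-separated with a header row that contains the
    columns 'word1', 'word2' and 'SimLex999' (an assumption about the
    original SimLex-999.txt layout). Returns the number of pairs written.
    """
    written = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        for row in reader:
            # Keep only the three fields a pair-based evaluation needs.
            writer.writerow([row["word1"], row["word2"], row["SimLex999"]])
            written += 1
    return written
```

The converted file could then be passed as `model.wv.evaluate_word_pairs('SimLex-999-pairs.txt')`. Note that this addresses the file format only. The `KeyError: 'movie'` in the second traceback is raised while the vocabulary is being built, before the file is read, so it may have a separate cause.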