Gensim: calling docvecs.most_similar raises an error

Time: 2018-08-03 20:59:44

Tags: python numpy gensim

When I call docvecs.most_similar with a vector inferred for a new document, I get the error AttributeError: 'list' object has no attribute 'shape'

# load model from file
from gensim.models.doc2vec import Doc2Vec
model_doc2vec = Doc2Vec.load("/path_to_file/doc2vec.bin")

# attempt to get most similar documents from docvec
tokens = "in space".split()
new_vector = model_doc2vec.infer_vector(tokens)
sims = model_doc2vec.docvecs.most_similar( positive=[new_vector], topn=10 )

This raises AttributeError: 'list' object has no attribute 'shape'

I have a hunch that this may be a version compatibility issue between numpy and gensim. I am using Python 3.6, numpy 1.14, and gensim 1.0.1.

Full error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-37-220db2331e84> in <module>()
----> 1 sims = model_doc2vec.docvecs.most_similar( positive=[new_vector], topn=10 )

~/doc2vec.py in most_similar(self, positive, negative, topn, clip_start, clip_end, indexer)
    436         there was chosen to be significant, such as more popular tag IDs in lower indexes.)
    437         """
--> 438         self.init_sims()
    439         clip_end = clip_end or len(self.doctag_syn0norm)
    440 

~/doc2vec.py in init_sims(self, replace)
    419                         mode='w+', shape=self.doctag_syn0.shape)
    420                 else:
--> 421                     self.doctag_syn0norm = empty(self.doctag_syn0.shape, dtype=REAL)
    422                 np_divide(self.doctag_syn0, sqrt((self.doctag_syn0 ** 2).sum(-1))[..., newaxis], self.doctag_syn0norm)
    423 

AttributeError: 'list' object has no attribute 'shape'

1 answer:

Answer 0: (score: 0)

RTFD

  

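delete_temporary_training_data(keep_doctags_vectors=True, keep_inference=True)
  Discard parameters that are used in training and scoring. Use this if you are sure you are done training the model.

Parameters:
  keep_doctags_vectors (bool, optional) – Set to False if you don't want to save the doctag vectors. In that case you will not be able to use most_similar(), similarity(), and similar methods.
  keep_inference (bool, optional) – Set to False if you don't want to store the parameters used by the infer_vector() method.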
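
If that is what happened here (an assumption, since the question does not show how the model was trained or saved), one way to confirm it is to check whether docvecs still holds an array before calling most_similar(). The sketch below is only illustrative: has_doctag_vectors is a hypothetical helper, not part of gensim, and it relies on the doctag_syn0 attribute name that appears in the traceback for this gensim version. The two stand-in objects replace a real model so the snippet runs without gensim installed.

```python
from types import SimpleNamespace


def has_doctag_vectors(docvecs):
    """Return True if docvecs still holds a non-empty, array-like doctag_syn0.

    Hypothetical helper for illustration; not part of gensim.
    """
    vectors = getattr(docvecs, "doctag_syn0", None)
    # init_sims() reads doctag_syn0.shape, so a plain list (or a missing
    # attribute) is enough to trigger the AttributeError from the question.
    shape = getattr(vectors, "shape", None)
    return shape is not None and len(shape) > 0 and shape[0] > 0


# Stand-ins for illustration only; with a real model you would pass
# model_doc2vec.docvecs instead.
stripped = SimpleNamespace(doctag_syn0=[])
intact = SimpleNamespace(doctag_syn0=SimpleNamespace(shape=(3, 100)))

print(has_doctag_vectors(stripped))  # False
print(has_doctag_vectors(intact))    # True
```

If the check returns False for the loaded model, the doctag vectors are not in the saved file, and the model would need to be retrained or re-saved with keep_doctags_vectors=True before most_similar() can work.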