I'm trying to load a model containing Spanish words with gensim 1.0 on Python 3.5, but when I call gensim.models.KeyedVectors.load_word2vec_format(mymodel), the CLI reports:
Traceback (most recent call last):
  File "./prueba.py", line 30, in <module>
    model = KeyedVectors.load_word2vec_format('./data/WikiModelEsp/wiki.size.800.window.5.mincount.50.new.model', binary=True)
  File "/usr/local/lib/python3.5/dist-packages/gensim/models/keyedvectors.py", line 192, in load_word2vec_format
    header = utils.to_unicode(fin.readline(), encoding=encoding)
  File "/usr/local/lib/python3.5/dist-packages/gensim/utils.py", line 231, in any2unicode
    return unicode(text, encoding, errors=errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
I tried calling the load function with encoding='latin1' and with binary=True, but it still doesn't work.
Answer 0 (score: 0)
Have you tried the plain load function? Like this: model = KeyedVectors.load(path_model)
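Some background on why this might help: the error complains about byte 0x80 at position 0, and 0x80 is the opcode that opens a Python pickle stream (protocol 2 and later). A file in word2vec format, text or binary, starts with a plain-text header line ("vocab_size vector_size"), so a leading 0x80 suggests the file was written with gensim's own save() rather than save_word2vec_format(). Below is a minimal sketch of that check. The helper names looks_like_pickle and load_vectors are hypothetical, and the first-byte test is only a heuristic, not something gensim does itself.

```python
def looks_like_pickle(path):
    """Heuristic: pickle protocol 2+ streams begin with the byte 0x80."""
    with open(path, "rb") as fin:
        return fin.read(1) == b"\x80"


def load_vectors(path):
    """Pick a gensim loader based on the first byte of the file (hypothetical helper)."""
    # Imported here so the sniffing helper above works without gensim installed.
    from gensim.models import KeyedVectors

    if looks_like_pickle(path):
        # Assumption: the file was produced by gensim's save(), which pickles the object.
        return KeyedVectors.load(path)
    # Otherwise assume the original C word2vec binary format.
    return KeyedVectors.load_word2vec_format(path, binary=True)
```

Usage would be model = load_vectors(path_model). If the file was saved from a full Word2Vec model rather than from its KeyedVectors, gensim.models.Word2Vec.load(path_model) may be the more appropriate call.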