Encoding error when loading GloVe word embeddings

Time: 2017-11-04 11:40:18

Tags: python encoding word-embedding

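I want to load pretrained word embeddings from GloVe, but I get an encoding error. I tried specifying the encoding explicitly, but that did not help. How can I solve this?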

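def load_word_embedding_glove(glovefile):
    """Load GloVe vectors from a text file into a dict of word -> list of floats."""
    print("loading glove word embeddings...")
    model = {}
    # Each line holds a word followed by its space-separated vector components
    with open(glovefile, 'r', encoding='utf-8') as f:
        for line in f:
            splitLine = line.split()
            word = splitLine[0]
            embedding = [float(val) for val in splitLine[1:]]
            model[word] = embedding
    print("glove word embedding is loaded")
    return model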

Then I call print(model), and the error occurs:

filepath = "../Pretrained/glove/glove.6B.100d.txt"
model = load_word_embedding_glove(filepath)
print (model)

Error:

UnicodeEncodeError: 'ascii' codec can't encode character '\u2013' in position 208465: ordinal not in range(128)
By the way, I am using Python 3.
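
One observation on the traceback: it is a UnicodeEncodeError from the 'ascii' codec, not a decode error. That suggests the file is read correctly as UTF-8 and the failure happens when print writes to an output stream that uses ASCII. The sketch below tries to reproduce that in isolation. The tiny `model` dict and the helper `print_to_stream` are made up for illustration, and the in-memory stream only stands in for a terminal whose encoding is ASCII, which I am assuming is the situation here.

```python
import io

# Hypothetical stand-in for the loaded embeddings: one key is U+2013 (en dash),
# the character named in the traceback.
model = {"\u2013": [0.1, 0.2]}


def print_to_stream(obj, encoding, errors="strict"):
    """Print obj to an in-memory text stream and return the bytes written."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    print(obj, file=stream)
    stream.flush()
    return raw.getvalue()


# An ASCII stream rejects the en dash, like the output stream in the question.
try:
    print_to_stream(model, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# A UTF-8 stream accepts the same data unchanged.
utf8_bytes = print_to_stream(model, "utf-8")

# An ASCII stream that escapes unencodable characters also works.
escaped_bytes = print_to_stream(model, "ascii", errors="backslashreplace")
```

If this is indeed the cause, two things that may help are setting the environment variable PYTHONIOENCODING=utf-8 before starting Python, or calling sys.stdout.reconfigure(encoding="utf-8") on Python 3.7 and later. Printing only a few entries rather than the whole dictionary would not avoid the error by itself if any of them contains a non-ASCII character.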

0 answers:

No answers yet.