for row in spamreader: UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 89: ordinal not in range(128)

Time: 2017-11-08 13:33:43

Tags: python python-2.7 unicode encoding utf-8

I am using the following code in Python 2.7 to read data from a UTF-8 encoded CSV file. I open the file with codecs.open and encoding='utf-8', but I still get the same error.

import csv
import gensim
import nltk
from nltk.corpus import stopwords
import codecs

...

with codecs.open('data_techsupport.csv', encoding='utf-8', errors='ignore') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in spamreader:
        vectors.append([])
        x=list(row)
        x[0].encode('utf8')
        sentence=nltk.word_tokenize(x[0])

...

The error I get is:

for row in spamreader:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 89: ordinal not in range(128)
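
A likely reading of this traceback, offered as an interpretation rather than a confirmed diagnosis: the csv module in Python 2 does not support Unicode input. codecs.open yields unicode lines, and csv.reader implicitly converts each one to a byte string with the default ASCII codec, which is why the exception is raised on the for line and not inside the loop body. The snippet below is a minimal sketch of that implicit step in isolation; the sample line is hypothetical and not taken from data_techsupport.csv.

```python
# -*- coding: utf-8 -*-
# Reproduces, outside of csv, the kind of implicit ASCII encode that
# Python 2's csv.reader performs when it is handed a unicode line.
# The sample text is a made-up stand-in for a line of the real file.
line = u'2016\u20132017,ticket text'  # contains an en dash (U+2013)

try:
    line.encode('ascii')
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    reason = exc.reason    # text after the last colon in the error message
    position = exc.start   # index of the character that could not be encoded
```

Here position is the offset of the en dash within the line, which corresponds to the "position 89" reported in the traceback for the real file.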

Can anyone help?
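
One possible workaround, sketched under the assumption that the file really is valid UTF-8: in Python 2, give csv.reader raw bytes by opening the file in binary mode, and decode each cell to unicode only after parsing. The helper name read_utf8_rows is hypothetical, and the Python 3 branch is included only so the sketch runs on either interpreter.

```python
import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def read_utf8_rows(path):
    """Yield rows of unicode strings from a UTF-8 encoded CSV file.

    read_utf8_rows is an illustrative helper, not part of the question's code.
    """
    if PY2:
        # Python 2: csv expects byte strings, so read bytes and decode
        # each cell after the reader has split the row.
        with open(path, 'rb') as f:
            for row in csv.reader(f, delimiter=',', quotechar='"'):
                yield [cell.decode('utf-8') for cell in row]
    else:
        # Python 3: csv works on text, so decoding happens at open time.
        with io.open(path, encoding='utf-8', newline='') as f:
            for row in csv.reader(f, delimiter=',', quotechar='"'):
                yield row
```

With this shape, the loop in the question would become for row in read_utf8_rows('data_techsupport.csv'), and row[0] could be passed to nltk.word_tokenize as unicode. The x[0].encode('utf8') line in the original discards its result, so it has no effect either way.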

0 answers:

No answers