Unable to use csv.reader with non-ASCII strings in Python 3

Time: 2018-02-04 00:12:08

Tags: python python-3.x csv unicode

This is my current code:

# -*- coding: utf-8 -*-
import csv
import codecs

# original directory
phys_comp_dir = '/Users/lmnt74/Physician_Compare'

# for row in Performance_Scores:
#     print(','.join(row))

# file name
National_Downloadable_File = ('/Physician_Compare_National_Downloadable'
                              '_File.csv')
National_File = csv.reader(open(phys_comp_dir+National_Downloadable_File,
                                newline='', encoding='utf-8'),
                           quotechar='|', quoting=csv.QUOTE_MINIMAL,
                           lineterminator='\n'
                           )

for row in National_File:
    for i in row:
        try:
            print(i)
        except UnicodeError:
            print(i.encode('latin-1').decode('utf-8'))

I get the following error:

Traceback (most recent call last):
  File "/Users/lmn74/Physician_Compare/q2.py", line 41, in <module>
    print(i)
UnicodeEncodeError: 'ascii' codec can't encode character '\xae' in position 52: ordinal not in range(128)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/lmnt74/Physician_Compare/q2.py", line 43, in <module>
    print(i.encode('latin-1').decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xae in position 52: invalid start byte
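
Both exceptions can be reproduced without the CSV file, which may make the traceback easier to read. The sketch below is only an illustration under assumptions: `field` is a made-up stand-in for one CSV value, and the explicit `encode("ascii")` imitates what `print` appears to be doing when the output stream's encoding is ASCII, as the first error message suggests.

```python
# Hypothetical stand-in for one CSV field containing the
# registered trademark sign (U+00AE).
field = "Brand\u00ae"

# print() has to encode text for the output stream. With an ASCII stream,
# that implicit step fails in the same way as this explicit encode.
try:
    field.encode("ascii")
    first_error = None
except UnicodeEncodeError as exc:
    first_error = exc

# The fallback turns U+00AE into the single byte 0xAE. On its own that byte
# is not valid UTF-8, so decoding it as UTF-8 raises the second exception.
try:
    field.encode("latin-1").decode("utf-8")
    second_error = None
except UnicodeDecodeError as exc:
    second_error = exc

# ascii() keeps the output printable even on an ASCII-only terminal.
print(ascii(first_error))
print(ascii(second_error))
```

So the file itself seems to be read correctly as UTF-8. The failure happens when the decoded string is written out, and the fallback in the `except` branch cannot succeed for this character.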

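I'm not sure how to proceed. I know the character that raises the error is the registered trademark sign (®). I'd like to figure out how to rewrite my code so that it checks for this character in each string, or, if there is a better way to account for it when the file is first read, I'm all for that.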

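What I have tried so far:

None of these helped me, and I haven't read enough to understand them. I'm a fairly new beginner, so anything that points me in the right direction would be greatly appreciated.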

Edit: figured it out, see below.

I changed:

print(i.encode('latin-1').decode('utf-8'))

to:

print(i.encode('ascii', 'ignore').decode('utf-8', 'ignore'))
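
One caveat: this version works by dropping every non-ASCII character, so the ® is removed from the output rather than printed. The sketch below shows that effect next to one possible alternative that keeps the character by writing through a UTF-8 text wrapper. It rests on assumptions: `field` is a made-up sample value, and an `io.BytesIO` buffer stands in for `sys.stdout.buffer` so the example runs the same way on any terminal.

```python
import io

# Hypothetical sample value containing the registered trademark sign.
field = "Brand\u00ae Name"

# The workaround above: characters outside ASCII are silently discarded.
stripped = field.encode("ascii", "ignore").decode("utf-8", "ignore")

# Possible alternative: send text through a wrapper that encodes as UTF-8.
# raw stands in for sys.stdout.buffer here.
raw = io.BytesIO()
utf8_out = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
print(field, file=utf8_out)
utf8_out.flush()
written = raw.getvalue()

print(ascii(stripped))  # the sign is gone
print(written)          # the sign survives as the UTF-8 bytes C2 AE
```

In the real script the same idea would probably look like `sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')` near the top, or setting the `PYTHONIOENCODING=utf-8` environment variable before running it. On Python 3.7 and later, `sys.stdout.reconfigure(encoding='utf-8')` should do the same. I have not confirmed which of these suits this particular setup.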

Sorry for wasting anyone's time.

0 answers