Here is my current code:
# -*- coding: utf-8 -*-
import csv
import codecs
# original directory
phys_comp_dir = '/Users/lmnt74/Physician_Compare'
# for row in Performance_Scores:
#     print(','.join(row))
# file name
National_Downloadable_File = ('/Physician_Compare_National_Downloadable'
                              '_File.csv')
National_File = csv.reader(open(phys_comp_dir+National_Downloadable_File,
                                newline='', encoding='utf-8'),
                           quotechar='|', quoting=csv.QUOTE_MINIMAL,
                           lineterminator='\n'
                           )
for row in National_File:
    for i in row:
        try:
            print(i)
        except UnicodeError:
            print(i.encode('latin-1').decode('utf-8'))
I get the following error:
Traceback (most recent call last):
  File "/Users/lmn74/Physician_Compare/q2.py", line 41, in <module>
    print(i)
UnicodeEncodeError: 'ascii' codec can't encode character '\xae' in position 52: ordinal not in range(128)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/lmnt74/Physician_Compare/q2.py", line 43, in <module>
    print(i.encode('latin-1').decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xae in position 52: invalid start byte
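I'm not sure how to proceed. I know the character that triggers the error is the registered trademark symbol (R). I'd like to figure out how to rewrite my code so that it checks for this character in each string, or, if there is a better way to handle it when the file is first read, I'm all for that.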
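For reference, both exceptions can be reproduced without the CSV file. The following is a minimal sketch, assuming the offending character is U+00AE as the traceback indicates; the sample string `text` is made up for illustration. The first failure is what printing to an ASCII-encoded stdout amounts to, and the second comes from the fallback itself: encoding to Latin-1 yields the single byte 0xAE, which is not a valid start byte in UTF-8.

```python
# Hypothetical sample value containing the registered trademark sign (U+00AE).
text = 'Brand\xae'

# 1) print() on a stdout whose encoding is ASCII effectively performs this encode.
try:
    text.encode('ascii')
    first_error = None
except UnicodeEncodeError as exc:
    first_error = exc

# 2) The fallback in the except branch: Latin-1 turns U+00AE into the byte 0xAE,
#    and 0xAE on its own cannot begin a UTF-8 sequence.
try:
    text.encode('latin-1').decode('utf-8')
    second_error = None
except UnicodeDecodeError as exc:
    second_error = exc

print(type(first_error).__name__)
print(type(second_error).__name__)
```

So the string read from the file is probably already decoded correctly; the problem appears to be on the output side, not in the CSV reading.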
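What I have tried so far:
None of these helped, and I could not understand enough from reading them. I'm a fairly new beginner, so anything that points me in the right direction would be greatly appreciated.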
Edit: figured it out, see below.
Change:
print(i.encode('latin-1').decode('utf-8'))
to:
print(i.encode('ascii', 'ignore').decode('utf-8', 'ignore'))
Sorry for wasting anyone's time.
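Note that the changed line silently drops every non-ASCII character instead of printing it. A small sketch of what it does, next to one possible alternative that keeps a visible escape sequence; the helper names `ascii_only` and `ascii_escaped` and the sample string are hypothetical and only serve the illustration:

```python
def ascii_only(value):
    """Same expression as the edited print line: non-ASCII characters are dropped."""
    return value.encode('ascii', 'ignore').decode('utf-8', 'ignore')


def ascii_escaped(value):
    """Alternative sketch: replace non-ASCII characters with a backslash escape."""
    return value.encode('ascii', 'backslashreplace').decode('ascii')


sample = 'Acme\xae Clinic'          # made-up value containing U+00AE
stripped = ascii_only(sample)       # 'Acme Clinic'
escaped = ascii_escaped(sample)     # 'Acme\\xae Clinic'

print(stripped)
print(escaped)
```

If the goal is to actually display the symbol, it may be worth checking whether the terminal's output encoding can be changed, for example by setting the `PYTHONIOENCODING` environment variable to `utf-8` before running the script. Whether that helps depends on the environment the script runs in.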