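I have read every post I could find, but my situation seems to be unique. I am new to Python, so this may be something basic. I get the following error:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 70: character maps to <undefined>
when I run this code: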
import csv
input_file = 'input.csv'
output_file = 'output.csv'
cols_to_remove = [4, 6, 8, 9, 10, 11, 13, 14, 19, 20, 21, 22, 23, 24]
cols_to_remove = sorted(cols_to_remove, reverse=True)
row_count = 0 # Current amount of rows processed
with open(input_file, "r") as source:
    reader = csv.reader(source)
    with open(output_file, "w", newline='') as result:
        writer = csv.writer(result)
        for row in reader:
            row_count += 1
            print('\r{0}'.format(row_count), end='')
            for col_index in cols_to_remove:
                del row[col_index]
            writer.writerow(row)
What am I doing wrong?
Answer 0: (score: 0)
Try pandas:
import pandas
input_file = pandas.read_csv('input.csv')
output_file = pandas.read_csv('output.csv')
Answer 1: (score: 0)
Add encoding="utf8" when opening the files. Try the following:
with open(input_file, "r", encoding="utf8") as source:
    reader = csv.reader(source)
    with open(output_file, "w", newline='', encoding="utf8") as result:
        ...  # rest of the loop unchanged
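Whether utf8 is the right choice depends on how the file was actually written. The 'charmap' wording in the error suggests the default codec was a single-byte code page such as cp1252, in which byte 0x8d is undefined. A minimal sketch for checking which encoding fits, assuming a hypothetical helper named first_working_encoding and an illustrative candidate list (neither comes from the question):

```python
def first_working_encoding(path, candidates=("utf-8", "cp1252")):
    """Return the first candidate that decodes the whole file, or None.

    The helper name and the candidate list are assumptions for illustration.
    """
    # Read raw bytes so that no implicit decoding happens on open().
    with open(path, "rb") as f:
        raw = f.read()
    for name in candidates:
        try:
            raw.decode(name)  # strict decoding raises on any invalid byte
        except UnicodeDecodeError:
            continue
        return name
    return None
```

Latin1 is left out of the candidates on purpose: it accepts every possible byte, so it would always appear to work. The whole file is read into memory here, which is fine for a quick check but may not suit very large files.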
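Answer 2: (score: 0)
In Python 3, the csv module works on Unicode strings, so the input file has to be decoded first. If you know the exact encoding, use it. Otherwise you can simply use Latin1: it maps every byte to the Unicode character with the same code point, so decoding followed by encoding leaves the byte values unchanged. Your code could become: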
...
with open(input_file, "r", encoding='Latin1') as source:
    reader = csv.reader(source)
    with open(output_file, "w", newline='', encoding='Latin1') as result:
        ...
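Put together with the loop from the question, the Latin1 approach might look like the sketch below. The function name remove_columns, its parameters, and the choice to skip indexes that a short row does not have are assumptions made for illustration, not part of the original code:

```python
import csv


def remove_columns(input_path, output_path, cols_to_remove, encoding="latin1"):
    """Copy a CSV file, dropping the given zero-based column indexes.

    Returns the number of rows written.
    """
    # Delete from the highest index down so earlier deletions
    # do not shift the positions of later ones.
    cols = sorted(set(cols_to_remove), reverse=True)
    row_count = 0
    with open(input_path, "r", newline="", encoding=encoding) as source, \
            open(output_path, "w", newline="", encoding=encoding) as result:
        writer = csv.writer(result)
        for row in csv.reader(source):
            for col_index in cols:
                if col_index < len(row):  # assumption: ignore missing columns
                    del row[col_index]
            writer.writerow(row)
            row_count += 1
    return row_count


# Latin1 round trip: every byte value survives decode followed by encode.
all_bytes = bytes(range(256))
assert all_bytes.decode("latin1").encode("latin1") == all_bytes
```

A call such as remove_columns('input.csv', 'output.csv', [4, 6, 8]) would then copy the file without those columns. Because Latin1 never fails, non-ASCII text may display incorrectly if the file was really written in another encoding, but the bytes in the columns that are kept pass through untouched.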