Question

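I have read every post I could find, but my situation seems to be unique. I am new to Python, so this may be something basic. I am getting the following error:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 70: character maps to <undefined>

when I run this code: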

import csv

input_file = 'input.csv'
output_file = 'output.csv'
cols_to_remove = [4, 6, 8, 9, 10, 11, 13, 14, 19, 20, 21, 22, 23, 24]

cols_to_remove = sorted(cols_to_remove, reverse=True)
row_count = 0 # Current amount of rows processed

with open(input_file, "r") as source:
    reader = csv.reader(source)
    with open(output_file, "w", newline='') as result:
        writer = csv.writer(result)
        for row in reader:
            row_count += 1
            print('\r{0}'.format(row_count), end='')
            for col_index in cols_to_remove:
                del row[col_index]
            writer.writerow(row)

What am I doing wrong?

Answer 1

Try pandas:

import pandas

input_file = pandas.read_csv('input.csv')
output_file = pandas.read_csv('output.csv')

Also try saving the file again as CSV UTF-8.

Answer 2

Add encoding="utf8" when opening the files. Try the following:

with open(input_file, "r", encoding="utf8") as source:
    reader = csv.reader(source)
    with open(output_file, "w", newline='', encoding="utf8") as result:
        ...
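
As a rough illustration of why an explicit encoding can help: when open() is called without encoding, Python falls back to a locale-dependent default, which on many Windows setups is cp1252, and cp1252 has no character assigned to byte 0x8d. The sketch below assumes the file is really UTF-8 (the question does not say so) and uses a made-up sample string; can_decode is a hypothetical helper, not part of the csv module.

```python
import locale

def can_decode(raw, encoding):
    """Return True if the bytes decode cleanly with the given encoding."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True

# Hypothetical sample: "Í" (U+00CD) is stored in UTF-8 as the bytes C3 8D,
# so a UTF-8 file containing it includes the byte 0x8d from the error.
sample = "Í".encode("utf8")

# The encoding open() uses when none is passed (platform dependent).
default_encoding = locale.getpreferredencoding(False)

decodes_as_cp1252 = can_decode(sample, "cp1252")  # 0x8d is undefined there
decodes_as_utf8 = can_decode(sample, "utf8")
```

If the file is not actually UTF-8, encoding="utf8" can fail with a similar UnicodeDecodeError, so it is worth checking how the file was saved.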

Answer 3

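In Python 3, the csv module processes the file as Unicode strings, so the input file must first be decoded. You can use the exact encoding if you know it, or simply use Latin1, because it maps every byte to the Unicode character with the same code point, so decoding followed by encoding leaves the byte values unchanged. Your code could become: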

...
with open(input_file, "r", encoding='Latin1') as source:
    reader = csv.reader(source)
    with open(output_file, "w", newline='', encoding='Latin1') as result:
        ...
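
A minimal sketch of the whole script with the Latin1 approach, in case it helps to see it end to end. The function name drop_columns, its parameters, and the bounds check on short rows are my own additions for illustration and are not in the original code. The first two lines after the function check the round-trip property described above.

```python
import csv

def drop_columns(input_path, output_path, cols_to_remove, encoding="latin1"):
    """Copy a CSV file, omitting the given column indexes. Returns rows written."""
    # Delete from the highest index down so earlier deletions do not shift later ones.
    cols = sorted(set(cols_to_remove), reverse=True)
    row_count = 0
    with open(input_path, "r", newline="", encoding=encoding) as source, \
         open(output_path, "w", newline="", encoding=encoding) as result:
        reader = csv.reader(source)
        writer = csv.writer(result)
        for row in reader:
            for col_index in cols:
                if col_index < len(row):  # assumption: silently skip short rows
                    del row[col_index]
            writer.writerow(row)
            row_count += 1
    return row_count

# Latin1 assigns a character to every byte value, so decode + encode is lossless.
all_bytes = bytes(range(256))
round_trip = all_bytes.decode("latin1").encode("latin1")
```

For the files in the question this would be called as drop_columns('input.csv', 'output.csv', [4, 6, 8, 9, 10, 11, 13, 14, 19, 20, 21, 22, 23, 24]). Note that with Latin1 the bytes survive unchanged, but non-ASCII characters may look wrong if you inspect the strings inside Python, since the real encoding could be something else.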

csv read raises "UnicodeDecodeError: 'charmap' codec can't decode..."
