csv read raises "UnicodeDecodeError: 'charmap' codec can't decode..."

Date: 2019-11-28 06:22:14

Tags: python encoding

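I have read every post I could find, but my situation seems to be unique. I am new to Python, so this may be something basic. I get the following error:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 70: character maps to <undefined>

when I run this code: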

import csv

input_file = 'input.csv'
output_file = 'output.csv'
cols_to_remove = [4, 6, 8, 9, 10, 11, 13, 14, 19, 20, 21, 22, 23, 24]

cols_to_remove = sorted(cols_to_remove, reverse=True)
row_count = 0 # Current amount of rows processed

with open(input_file, "r") as source:
    reader = csv.reader(source)
    with open(output_file, "w", newline='') as result:
        writer = csv.writer(result)
        for row in reader:
            row_count += 1
            print('\r{0}'.format(row_count), end='')
            for col_index in cols_to_remove:
                del row[col_index]
            writer.writerow(row)

What am I doing wrong?

3 answers:

Answer 0 (score: 0)

  1. Try pandas

import pandas
input_file = pandas.read_csv('input.csv')
output_file = pandas.read_csv('output.csv')

  2. Try saving the file again as CSV UTF-8
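
If re-saving from a spreadsheet program is not convenient, the same conversion can be scripted. This is only a rough sketch: reencode_to_utf8 is a made-up helper name, and cp1250 is used purely as a stand-in for whatever encoding the file really has, which the question does not tell us.

```python
import os
import tempfile


def reencode_to_utf8(src, dst, source_encoding):
    """Rewrite the text file at src as UTF-8 at dst (hypothetical helper)."""
    # newline="" passes line endings through untouched in both directions.
    with open(src, "r", encoding=source_encoding, newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding="utf-8", newline="") as fout:
        fout.write(text)


# Demo on a throwaway file; cp1250 is an assumed source encoding.
sample = "name,city\r\nJiří,Plzeň\r\n"
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "input.csv")
    dst = os.path.join(tmp, "input_utf8.csv")
    with open(src, "wb") as f:
        f.write(sample.encode("cp1250"))
    reencode_to_utf8(src, dst, "cp1250")
    with open(dst, "rb") as f:
        converted = f.read()
```

The converted file can then be opened with encoding="utf8". If the source encoding is guessed wrong, the conversion either raises the same kind of UnicodeDecodeError or silently produces the wrong characters, so it is worth checking a few non-ASCII values afterwards.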

Answer 1 (score: 0)

Add encoding="utf8" when opening the files. Try the following:

with open(input_file, "r", encoding="utf8") as source:
    reader = csv.reader(source)
    with open(output_file, "w", newline='', encoding="utf8") as result:
        ...
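
Some background on why this can help, as a sketch rather than a diagnosis: without an encoding argument, open() uses a platform-dependent default, which on many Windows setups is cp1252. Python reports that codec as 'charmap', and byte 0x8d has no character assigned in it. The snippet below assumes the file is really UTF-8 (the question does not confirm this) and uses the letter "č" only as an example of a character whose UTF-8 form contains 0x8d.

```python
import locale

# Typically what open() falls back to when no encoding is passed; it differs
# between platforms and Python configurations.
default_encoding = locale.getpreferredencoding(False)

# "č" (U+010D) is two bytes in UTF-8, and the second one is 0x8d.
utf8_bytes = "č".encode("utf-8")

# Decoding those bytes as cp1252 fails, because 0x8d is unassigned there.
try:
    utf8_bytes.decode("cp1252")
    cp1252_error = None
except UnicodeDecodeError as exc:
    cp1252_error = str(exc)

# Decoding them with the encoding they were written in works.
decoded = utf8_bytes.decode("utf-8")

print(default_encoding)
print(cp1252_error)
print(decoded)
```

If the file turns out not to be UTF-8, encoding="utf8" will raise its own UnicodeDecodeError, which at least tells you the guess was wrong.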

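Answer 2 (score: 0)

In Python 3, the csv module handles the file as Unicode strings, so the input file has to be decoded first. If you know the exact encoding, use it; otherwise you can simply use Latin1, because it maps every byte to the Unicode character with the same code point, so decoding and then encoding leaves the byte values unchanged. Your code could become: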

...
with open(input_file, "r", encoding='Latin1') as source:
    reader = csv.reader(source)
    with open(output_file, "w", newline='', encoding='Latin1') as result:
        ...
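
The round-trip claim is easy to check for yourself. A small sketch, where "é" is just an arbitrary example character and nothing here is taken from the asker's file:

```python
# Every possible byte value, 0x00 through 0xff.
all_bytes = bytes(range(256))

# Latin1 never fails: each byte becomes the character with the same code point.
as_text = all_bytes.decode("latin-1")
round_tripped = as_text.encode("latin-1")

# Caveat: if the data is really UTF-8, the text looks wrong while it is in
# memory, but the bytes written back out are still identical to the input.
utf8_e = "é".encode("utf-8")        # two bytes: 0xc3 0xa9
misread = utf8_e.decode("latin-1")  # two characters instead of one
written_back = misread.encode("latin-1")

print(round_tripped == all_bytes)
print(repr(misread))
print(written_back == utf8_e)
```

For a job like deleting columns this is usually good enough, since the commas, quotes and line breaks that csv looks for are plain ASCII in both Latin1 and UTF-8. It is not a good fit if the code needs to inspect or change the non-ASCII text itself.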