String to byte conversion error in copying csv file

Time: 2018-02-26 17:57:59

Tags: python python-3.x csv

I wrote the following code in Python 3.6.4 to read a .csv file, "Data.csv", in my directory and copy its contents to a NamedTemporaryFile, "temp_file":

import csv
from tempfile import NamedTemporaryFile

file_path= "Data.csv"
temp_file=NamedTemporaryFile(delete=False)
with open(file_path,"rb") as csvfile,temp_file:
    fieldnames=["id","Title","Desc","Comments"]
    reader=csv.DictReader(csvfile,fieldnames=fieldnames)
    writer=csv.DictWriter(temp_file,fieldnames=fieldnames)
    for row in reader:
        temp_row={"id":row["id"],"Title":row["Title"],"Desc":row["Desc"],"Comments":row["Comments"],}
        writer.writerow(temp_row)
        print(row)

Executing the above code raised an error on the iterator line (for row in reader). The error said:

_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

I searched for a solution and found this answer

But this did not solve my problem: when I changed "rb" to "rt", I started getting another error on the line

writer.writerow(temp_row)

This error said:

TypeError: a bytes-like object is required, not 'str'

I think this problem occurs only in Python 3 and does not affect Python 2.

I have already tried methods like encode() and decode() to convert strings to bytes, as well as library functions like json.dumps().

1 answer:

Answer 0 (score: 2)

You are opening the file in binary mode:

with open(file_path,"rb") as csvfile,temp_file:
    #                ^^ b is for binary

Don't do that. The error message tells you that you should not do this; it specifically reminds you to open the file in text mode:

(did you open the file in text mode?)

Remove the b from the file mode.

The NamedTemporaryFile() class also defaults to a binary mode, 'w+b', so you need to tell it to open the file in text mode as well; use 'w+' or 'w'.

Corrected code:

temp_file = NamedTemporaryFile(mode='w+', delete=False)
with open(file_path, "r") as csvfile, temp_file:
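The second error from the question can be reproduced in isolation, which may make the role of the mode argument clearer. This is a minimal sketch using only the standard library; accepts_str is a hypothetical helper name introduced for illustration, not something from the question.

```python
from tempfile import NamedTemporaryFile


def accepts_str(**kwargs):
    """Return True if a NamedTemporaryFile opened with kwargs accepts str data.

    Hypothetical helper, for illustration only.
    """
    with NamedTemporaryFile(**kwargs) as tmp:
        try:
            # csv writers hand str objects to the file's write() method.
            tmp.write("id,Title\r\n")
        except TypeError:
            # Binary-mode files only accept bytes-like objects.
            return False
        return True


print(accepts_str())           # default mode is 'w+b' (binary)
print(accepts_str(mode="w+"))  # text mode
```

The first call should print False and the second True, which matches the TypeError seen once only the input file had been switched to text mode.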

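In Python 2, the csv module could only handle data in binary mode, partly because Python did not have robust Unicode support when the module was first created, and partly because in Python 2, text-mode files on some platforms translate line endings in a way that is incompatible with the CSV standard.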

In Python 3, it is strongly recommended that you switch off line-ending translation by adding newline='' when opening files:

temp_file = NamedTemporaryFile(mode='w+', newline='', delete=False)
with open(file_path, "r", newline='') as csvfile, temp_file:

I'd also strongly consider setting the encoding explicitly, rather than relying on the system default.
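As a sketch of what an explicit encoding could look like, the function below passes encoding to both open() and NamedTemporaryFile(). The name copy_csv and the choice of UTF-8 are assumptions made for illustration; use whatever encoding the data file was actually written in.

```python
import csv
from tempfile import NamedTemporaryFile


def copy_csv(src_path, fieldnames, encoding="utf-8"):
    """Copy CSV rows from src_path into a named temporary file.

    Hypothetical helper; returns the path of the temporary file.
    The UTF-8 default is an assumption, not taken from the question.
    """
    temp_file = NamedTemporaryFile(
        mode="w+", newline="", encoding=encoding, delete=False
    )
    with open(src_path, "r", newline="", encoding=encoding) as csvfile, temp_file:
        reader = csv.DictReader(csvfile, fieldnames=fieldnames)
        writer = csv.DictWriter(temp_file, fieldnames=fieldnames)
        for row in reader:
            writer.writerow(row)
    return temp_file.name
```

Because delete=False is used, the caller is responsible for removing the temporary file once it is no longer needed.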

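As a side note, there is no need to spell out all the column names when copying the dictionaries, or even to create a new dictionary per row, because the fieldnames lists for DictReader() and DictWriter match. You can pass the reader directly to the DictWriter.writerows() method: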

import csv
from tempfile import NamedTemporaryFile

file_path = "Data.csv"
fieldnames = ["id", "Title", "Desc", "Comments"]
temp_file = NamedTemporaryFile(mode='w+', newline='', delete=False)
with open(file_path, "r", newline='') as csvfile, temp_file:
    reader = csv.DictReader(csvfile, fieldnames=fieldnames)
    writer = csv.DictWriter(temp_file, fieldnames=fieldnames)
    writer.writerows(reader)

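At this point I'd wonder whether you just want to use shutil.copyfileobj() and avoid parsing into dictionaries and then serializing back to CSV altogether.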
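If a verbatim copy is all that is needed, something along these lines might do. It is only a sketch: copy_raw is a hypothetical name, and binary mode is used on both sides on the assumption that nothing has to be parsed or re-encoded.

```python
import shutil
from tempfile import NamedTemporaryFile


def copy_raw(src_path):
    """Copy src_path byte for byte into a named temporary file.

    Hypothetical helper; returns the path of the temporary file.
    """
    # Binary mode is fine here because the csv module is not involved.
    with open(src_path, "rb") as src, NamedTemporaryFile(delete=False) as dst:
        shutil.copyfileobj(src, dst)
    return dst.name
```

Unlike the DictReader/DictWriter round trip, this leaves quoting, line endings, and encoding exactly as they were in the source file.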