Converting bytes to strings in Python 3 when the file is not opened explicitly

Date: 2016-07-08 11:50:33

Tags: python-3.x csv python-requests

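I am using the Requests module to authenticate and then pull csv content from a web API, and I have this working fine in Python 2.7. I now want to write the same script in Python 3.5, but I am running into this error: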

"iterator should return strings, not bytes (did you open the file in text mode?)"

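requests.get appears to return bytes rather than a string, which seems related to the encoding issues that come up when moving to Python 3.x. The error is raised on the third line from the bottom: next(reader). In Python 2.7 this was not a problem, because the csv module worked with files opened in binary ('wb') mode.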
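The failure can be reproduced without any network call. The following is a minimal sketch: the sample bytes are made up and only stand in for what r.iter_lines() yields in Python 3, and first_row_error is a hypothetical helper name used for illustration.

```python
import csv

# Made-up sample data standing in for what r.iter_lines() yields in Python 3.
byte_lines = [b'position,id,title', b'1,42,example']


def first_row_error(lines):
    """Return the csv.Error message raised when reading `lines`, or None."""
    try:
        next(csv.reader(lines))
    except csv.Error as exc:
        return str(exc)
    return None


# Feeding bytes to csv.reader fails; feeding str does not.
message = first_row_error(byte_lines)
print(message)
print(first_row_error(['position,id,title', '1,42,example']))
```

The exact wording of the hint in parentheses may differ between Python versions, but the "iterator should return strings, not bytes" part is what identifies the problem.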

This post is very similar, but since I am not opening the csv file directly, I can't seem to force the response text to be encoded that way: csv.Error: iterator should return strings, not bytes

import csv
import requests

# username, password, countrydict and datevalue are assumed to be defined elsewhere in the script

countries = ['UK','US','CA']
datelist = [1,2,3,4]
baseurl = 'https://somewebsite.com/exporttoCSV.php'

#--- For all date/cc combinations
for cc in countries:
    for d in datelist:

        #---Build API String with variables (d is an int, so convert it first)
        url = (baseurl + '?data=chart&output=csv' +
               '&dataset=' + str(d) +
               '&cc=' + cc)

        #---Run API Call and create reader object
        r = requests.get(url, auth=(username, password))
        text = r.iter_lines()
        reader = csv.reader(text,delimiter=',')

        #---Write csv output to csv file with territory and date columns
        with open(cc + '_' + str(d) + '.csv','wt', newline='') as file:
            a = csv.writer(file)
            a.writerow(['position','id','title','kind','peers','territory','date']) #---Write header line
            next(reader) #---Skip original headers
            for i in reader:
                a.writerow(i +[countrydict[cc]] + [datevalue])

2 answers:

Answer 0: (score: 1)

I can't test your exact scenario, but I believe this can be solved by changing text = r.iter_lines() to:

text = [line.decode('utf-8') for line in r.iter_lines()]

This decodes each line read in by r.iter_lines() from a byte string into a string that csv.reader can use.

My test case is as follows:

>>> import csv
>>> iter_lines = [b'1,2,3,4',b'2,3,4,5',b'3,4,5,6']
>>> text = [line.decode('utf-8') for line in iter_lines]
>>> text
['1,2,3,4', '2,3,4,5', '3,4,5,6']
>>> reader = csv.reader(text,delimiter=',')
>>> next(reader)
['1', '2', '3', '4']
>>> for i in reader:
...     print(i)
...
['2', '3', '4', '5']
['3', '4', '5', '6']
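For large responses, the same decoding can be done lazily instead of building the whole list in memory. This is only a sketch under assumptions: decode_lines is a hypothetical helper name, the sample bytes stand in for r.iter_lines(), and UTF-8 is assumed as the encoding of the response.

```python
import csv


def decode_lines(byte_lines, encoding='utf-8'):
    """Lazily decode an iterable of byte strings, one line at a time."""
    for line in byte_lines:
        yield line.decode(encoding)


# Sample bytes standing in for r.iter_lines(); no network call is made here.
sample = [b'position,id', b'1,42', b'2,43']

reader = csv.reader(decode_lines(sample), delimiter=',')
next(reader)  # skip the original header row
rows = list(reader)
print(rows)
```

In the script from the question, this would presumably be used as csv.reader(decode_lines(r.iter_lines()), delimiter=','), so that each line is decoded only when csv.reader asks for it.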

Answer 1: (score: 1)

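Some files have to be read in as bytes, for example those coming from Django's SimpleUploadedFile, a test class that only works with bytes. Here is some example code from my test suite showing how I deal with it: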

test_code.py

import os
from django.core.files.uploadedfile import SimpleUploadedFile
from django.test import TestCase

class ImportDataViewTests(TestCase):

    def setUp(self):
        self.path = "test_in/example.csv"
        self.filename = os.path.split(self.path)[1]

    def test_file_upload(self):
        with open(self.path, 'rb') as infile:
            _file = SimpleUploadedFile(self.filename, infile.read())

        # now an `InMemoryUploadedFile` exists, so test it as you shall!

prod_code.py

import csv

def import_records(self, infile):
    csvfile = (line.decode('utf8') for line in infile)
    reader = csv.DictReader(csvfile)

    for row in reader:
        # loop through file and do stuff!
        pass
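The decoding pattern above can be tried outside Django. In this rough sketch, io.BytesIO is an assumption standing in for the uploaded file, read_records is a hypothetical name, and the column names are made up for illustration.

```python
import csv
import io


def read_records(infile, encoding='utf-8'):
    """Decode a binary file-like object line by line and parse it as csv."""
    csvfile = (line.decode(encoding) for line in infile)
    return [dict(row) for row in csv.DictReader(csvfile)]


# An in-memory bytes buffer stands in for the uploaded file.
fake_upload = io.BytesIO(b'position,title\n1,foo\n2,bar\n')
records = read_records(fake_upload)
print(records)
```

Iterating over a binary file object yields one bytes line at a time, so the generator expression decodes lazily and csv.DictReader only ever sees strings.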