While writing

Time: 2016-01-28 04:26:02

Tags: python csv python-3.x

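Warning: I am very new to Python and to programming. I am trying to use Python 3 to fetch some CSV data and make some changes to it before writing it to a file. My problem is accessing the CSV data from a variable, like this: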

import csv
import requests

csvfile = session.get(url)
reader = csv.reader(csvfile.content)

for row in reader:
    do(something)

This returns:

_csv.Error: iterator should return strings, not int (did you open the file in text mode?)

Googling suggested that I should give the reader text instead of bytes, so I also tried:

 reader = csv.reader(csvfile.text)

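That does not work either, because the loop then goes character by character instead of line by line. I also tried TextIOWrapper and similar options without success. The only way I managed to get this working was to write the data to a file, read it back, and then make the changes, like this: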

csvfile = session.get(url)

with open("temp.txt", 'wb') as f:
    f.write(csvfile.content)

with open("temp.txt", 'r', encoding="utf8", newline='') as data:
    reader = csv.reader(data)
    for row in reader:
        do(something)

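I feel this is far from the best approach, even though it works. What is the proper way to read and edit the CSV data directly in memory, without having to save it to a temporary file?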

1 answer:

Answer 0: (score: 0)

You do not have to write a temporary file. Here is what I would do, using the "csv" and "requests" modules:

import csv
import requests

__csvfilepathname__ = r'c:\test\test.csv'
__url__ = 'https://server.domain.com/test.csv'

def csv_reader(filename, enc = 'utf_8'):
    with open(filename, 'r', encoding = enc) as openfileobject:
        reader = csv.reader(openfileobject)
        for row in reader:
            #do something
            print(row)
    return

def csv_from_url(url):
    line = ''
    datalist = []
    s = requests.Session()
    r = s.get(url)    
    for x in r.text.replace('\r',''):
        if not x[0] == '\n':
            line = line + str(x[0])
        else: 
            datalist.append(line)
            line = ''
    datalist.append(line)
    # at this point you already have a data list 'datalist'
    # no need really to use the csv.reader object, but here goes:
    reader = csv.reader(datalist)
    for row in reader:
        #do something
        print(row)
    return

def main():
    csv_reader(__csvfilepathname__)
    csv_from_url(__url__)
    return

if __name__ == '__main__':
    main()

It is not very pretty, and it may not be great in terms of memory or performance, depending on how "big" your CSV data is.
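A possibly tidier variant, as a minimal sketch rather than a definitive implementation: wrap the decoded text in `io.StringIO` so that `csv.reader` sees a file-like object and iterates over lines instead of single characters. The helper name `rows_from_text` and the sample string are hypothetical; the sample string stands in for `r.text` so the snippet runs without a network call.

```python
import csv
import io


def rows_from_text(text):
    """Parse CSV text that is already in memory and return a list of rows."""
    # newline="" leaves line endings untouched, so the csv module can deal
    # with "\r\n" and with line breaks inside quoted fields itself.
    buffer = io.StringIO(text, newline="")
    return list(csv.reader(buffer))


# Hypothetical stand-in for `r.text` from a requests response.
sample_text = 'name,note\r\nalice,"line one\nline two"\r\nbob,plain\r\n'

rows = rows_from_text(sample_text)
for row in rows:
    print(row)
```

Unlike splitting the text on newlines by hand, this keeps a quoted field that contains a line break together as one value.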

HTH, Edwin.
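On the memory concern mentioned above, one option is to parse rows lazily. `csv.reader` accepts any iterable that yields strings, so it can consume lines as they arrive instead of a fully built list. The sketch below uses a hypothetical generator, `fake_line_stream`, in place of a real HTTP response; with `requests`, the lines could presumably come from `r.iter_lines(decode_unicode=True)` on a request made with `stream=True`, assuming the response declares an encoding so that the lines are decoded to `str`.

```python
import csv


def iter_csv_rows(lines):
    """Yield parsed rows from any iterable of text lines, one row at a time."""
    # csv.reader pulls lines from the iterable on demand, so the full
    # payload never has to be held in memory at once.
    yield from csv.reader(lines)


def fake_line_stream():
    # Hypothetical stand-in for a streamed response body.
    yield "id,city"
    yield "1,Oslo"
    yield "2,Lima"


streamed_rows = []
for row in iter_csv_rows(fake_line_stream()):
    # do something with each row as it arrives
    streamed_rows.append(row)

print(streamed_rows)
```

A caveat with line-based streaming: quoted fields that contain line breaks may not survive exactly, because the stream has already been split into lines before the csv module sees it.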