Question

I have a CSV file (yy.csv) that looks like this:

term,country,score,week
aa,USA,26.17,11/22/15-11/28/15
bb,USA,16.5,11/15/15-11/21/15
cc,UK,31.36,11/22/15-11/28/15
dd,UK,21.24,11/15/15-11/21/15
ee,FR,19.2,11/22/15-11/28/15

I also have a list named country_list:

country_list=['USA','UK','FR']

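For each country in country_list, I loop through the CSV file and collect the term value from every row where the country column (col2) equals that country and week == 11/22/15-11/28/15.

Here is my code: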

import csv

last_week_dict = {}

with open("yy.csv") as infile:
    reader = csv.reader(infile)
    next(reader, None)  # skip the header row
    for index, country in enumerate(country_list):
        print(country)
        last_week_dict[country] = []
        for reader_row in reader:
            if reader_row[1] == country and reader_row[3] == "11/22/15-11/28/15":
                last_week_dict[country].append(reader_row[0])

        print(last_week_dict[country])

The print statements should give me this output:

last_week_dict['USA']=['aa']
last_week_dict['UK']=['cc']
last_week_dict['FR']=['ee']

But values are only appended under the USA key:

last_week_dict['USA']=['aa']
last_week_dict['UK']=[]
last_week_dict['FR']=[]

Could it be that, after the pass for USA, iterating over the CSV file does not start again from the top of the file?

Answer 1

Your suspicion is correct: the reader object is an iterator, and iterators can only be read in the forward direction. They cannot be reset.
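To make the exhaustion visible, here is a minimal sketch. It uses an in-memory io.StringIO as a stand-in for yy.csv (that substitution is my assumption, so the snippet runs without the file). It also shows that, although the reader itself cannot be reset, the underlying file object can usually be rewound with seek(0) and wrapped in a fresh reader if you really do want a second pass:

```python
import csv
import io

# In-memory stand-in for yy.csv (an assumption, so the sketch is self-contained).
data = io.StringIO(
    "term,country,score,week\n"
    "aa,USA,26.17,11/22/15-11/28/15\n"
    "bb,USA,16.5,11/15/15-11/21/15\n"
)

reader = csv.reader(data)
first_pass = list(reader)   # consumes every row, header included
second_pass = list(reader)  # the same reader is already exhausted

print(len(first_pass))   # 3
print(len(second_pass))  # 0

# Rewind the underlying file and build a new reader to start from the top.
data.seek(0)
third_pass = list(csv.reader(data))
print(len(third_pass))   # 3
```

With a real file you would call infile.seek(0) inside the country loop and skip the header again each time. That works, but it rereads the whole file once per country.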

You can avoid looping over the file more than once. Your goal seems to be to collect data only for specific countries, and you can do that in a single loop:

import collections
import csv

# If this were a much longer collection of items it
# would be better to use a set than a list.
country_list = ['USA', 'UK', 'FR']

# The defaultdict(list) allows us to just start appending
# rather than checking whether the key is already in the
# dict.
last_week_dict = collections.defaultdict(list)

with open('yy.csv') as infile:
    reader = csv.reader(infile)
    __ = next(reader)  # Skip the header
    for term, country, score, week in reader:
        if country not in country_list:
            continue
        if week != '11/22/15-11/28/15':
            continue
        last_week_dict[country].append(term)

print(last_week_dict)
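If you want to check the single-pass approach without touching the file, the sketch below runs the same filtering against the sample rows from the question. The helper name terms_by_country, the SAMPLE string, and the use of csv.DictReader with a set are my own choices for illustration, not part of the code above:

```python
import collections
import csv
import io

# Sample rows copied from the question; stands in for yy.csv.
SAMPLE = """term,country,score,week
aa,USA,26.17,11/22/15-11/28/15
bb,USA,16.5,11/15/15-11/21/15
cc,UK,31.36,11/22/15-11/28/15
dd,UK,21.24,11/15/15-11/21/15
ee,FR,19.2,11/22/15-11/28/15
"""


def terms_by_country(infile, countries, week):
    """Hypothetical helper: one pass over the file, grouping terms by country."""
    wanted = set(countries)  # set membership tests stay fast for long lists
    result = collections.defaultdict(list)
    for row in csv.DictReader(infile):  # DictReader consumes the header itself
        if row["country"] in wanted and row["week"] == week:
            result[row["country"]].append(row["term"])
    return dict(result)


result = terms_by_country(
    io.StringIO(SAMPLE), ["USA", "UK", "FR"], "11/22/15-11/28/15"
)
print(result)  # {'USA': ['aa'], 'UK': ['cc'], 'FR': ['ee']}
```

Note that a country with no matching rows will simply be missing from the result here. If you need an empty list for it, as in the question's expected output, use result.get(country, []) when reading.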
