I have a CSV file (yy.csv) that looks like this:
term,country,score,week
aa,USA,26.17,11/22/15-11/28/15
bb,USA,16.5,11/15/15-11/21/15
cc,UK,31.36,11/22/15-11/28/15
dd,UK,21.24,11/15/15-11/21/15
ee,FR,19.2,11/22/15-11/28/15
I also have a list named country_list:
country_list=['USA','UK','FR']
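For each country in country_list, I loop through the CSV file and collect the term value from every row where the country column (column 2) matches that country and week == 11/22/15-11/28/15.
Here is my code: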
with open("yy.csv") as infile:
    reader = csv.reader(infile)
    next(reader, None)
    for index,country in enumerate(country_list):
        print country
        last_week_dict[country] = []
        for reader_row in reader:
            if ((reader_row[1] == country) and (reader_row[3] == "11/22/15-11/28/15")):
                last_week_dict[country].append(reader_row[0])
            else:
                continue
        print last_week_dict[country]
I expect the print statements to produce this output:
last_week_dict['USA']=['aa']
last_week_dict['UK']=['cc']
last_week_dict['FR']=['ee']
But values are only appended to the USA key:
last_week_dict['USA']=['aa']
last_week_dict['UK']=[]
last_week_dict['FR']=[]
Could it be that when I iterate over the CSV file, it does not start again from the top of the file after it has gone through USA?
Answer 0 (score: 2)
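Your suspicion is correct: the reader object is an iterator, and iterators can only be read in the forward direction. They cannot be reset. The inner loop for USA consumes every remaining row, so by the time the outer loop reaches UK and FR the reader has nothing left to yield.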
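To make the exhaustion visible, here is a minimal sketch in Python 3. It assumes an in-memory io.StringIO buffer as a stand-in for yy.csv; the names data, first_pass, second_pass and third_pass are made up for illustration. It also shows that while the reader itself cannot be rewound, the underlying file object can be, after which a new reader has to be created.

```python
import csv
import io

# In-memory stand-in for yy.csv (an assumption, so the sketch is self-contained).
data = io.StringIO(
    "term,country,score,week\n"
    "aa,USA,26.17,11/22/15-11/28/15\n"
    "bb,USA,16.5,11/15/15-11/21/15\n"
    "cc,UK,31.36,11/22/15-11/28/15\n"
)

reader = csv.reader(data)
next(reader, None)  # skip the header row

first_pass = list(reader)   # consumes every remaining row
second_pass = list(reader)  # the reader is already exhausted

print(len(first_pass))   # 3
print(len(second_pass))  # 0

# The reader cannot be reset, but the underlying file object can be
# rewound; a fresh reader then starts from the top again.
data.seek(0)
reader = csv.reader(data)
next(reader, None)
third_pass = list(reader)
print(len(third_pass))   # 3
```

Rewinding inside the country loop would make the original nested loops work, but it rereads the whole file once per country.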
You can avoid looping over the file more than once. Your goal seems to be to collect data only for specific countries, and you can do that in a single loop:
import collections
import csv

# If this were a much longer collection of items it
# would be better to use a set than a list.
country_list = ['USA', 'UK', 'FR']

# The defaultdict(list) allows us to just start appending
# rather than checking whether the key is already in the
# dict.
last_week_dict = collections.defaultdict(list)

with open('yy.csv') as infile:
    reader = csv.reader(infile)
    __ = next(reader)  # Skip the header
    for term, country, score, week in reader:
        if country not in country_list:
            continue
        if week != '11/22/15-11/28/15':
            continue
        last_week_dict[country].append(term)

print(last_week_dict)
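One difference from the output the question asks for: with a defaultdict, a country that has no matching row never gets a key at all, whereas the question expects an empty list for it. The following is a possible variant, not the answer's own code, written for Python 3. The function name terms_by_country, the variable wanted and the in-memory sample are hypothetical; it uses a set for the membership test, as the comment in the code above suggests, and csv.DictReader so columns are addressed by header name.

```python
import csv
import io


def terms_by_country(lines, countries, week):
    """Collect terms per country for one week in a single pass over `lines`.

    Hypothetical helper: `lines` is any iterable of CSV lines, such as
    an open file object.
    """
    wanted = set(countries)  # set membership tests are O(1) on average
    # Pre-seed every requested country so that a country with no
    # matching row still appears with an empty list.
    result = {country: [] for country in countries}
    reader = csv.DictReader(lines)  # field names come from the header row
    for row in reader:
        if row["country"] in wanted and row["week"] == week:
            result[row["country"]].append(row["term"])
    return result


# In-memory stand-in for yy.csv, using the rows from the question.
sample = io.StringIO(
    "term,country,score,week\n"
    "aa,USA,26.17,11/22/15-11/28/15\n"
    "bb,USA,16.5,11/15/15-11/21/15\n"
    "cc,UK,31.36,11/22/15-11/28/15\n"
    "dd,UK,21.24,11/15/15-11/21/15\n"
    "ee,FR,19.2,11/22/15-11/28/15\n"
)

result = terms_by_country(sample, ["USA", "UK", "FR"], "11/22/15-11/28/15")
print(result)  # {'USA': ['aa'], 'UK': ['cc'], 'FR': ['ee']}
```

With a real file, the same call would be made inside a with open('yy.csv', newline='') as infile: block, passing infile in place of sample.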