I have four columns of data that I am trying to consolidate based on two criteria. The data is formatted as follows:
CountyName Year Oil Gas
ANDERSON 2010 1358 0
ANDERSON 2010 621746 4996766
ANDERSON 2011 1587 0
ANDERSON 2011 633120 5020877
ANDERSON 2012 55992 387685
ANDERSON 2012 1342 0
ANDERSON 2013 635572 3036578
ANDERSON 2013 4873 0
ANDERSON 2014 656440 2690333
ANDERSON 2014 12332 0
ANDERSON 2015 608454 2836272
ANDERSON 2015 23339 0
ANDERSON 2016 551728 2682261
ANDERSON 2016 12716 0
ANDERSON 2017 132466 567874
ANDERSON 2017 1709 0
ANDREWS 2010 25701725 1860063
ANDREWS 2010 106351 0
ANDREWS 2011 97772 0
ANDREWS 2011 28818329 1377865
ANDREWS 2012 105062 0
...
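I want to combine the Oil and Gas values of the duplicate entries. For example, I would like to add up all of the oil entries for Anderson County for 2010 and replace the existing entries with a single row holding that total. The code I am using now sums all values for each county regardless of year, which gives me this condensed output: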
CountyName Year Oil Gas
ANDERSON 3954774
ANDREWS 206472698
...
Here is the code I am using:
import csv
with open('Texas.csv', 'r') as Texas: #opening Texas csv file
    TexasReader = csv.reader(Texas)
    counties = {}
    years = {}
    index = 0 and 1
    for row in TexasReader:
        if index == 0 and 1:
            header = row
        else:
            county = row[0]
            year = row[1]
            oil = row[2]
            gas = row[3]
            if county in counties:
                counties[county] += int(oil)
            else:
                counties[county] = int(oil)
        index += 1
with open('TexasConsolidated.csv', 'w') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=header, delimiter=',', lineterminator='\n')
    writer.writeheader()
    for k, v in counties.items():
        writer.writerow({header[0]: k, header[2]: v})
Answer 0 (score: 0)
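These are the lines causing the behavior you describe:
if county in counties:
    counties[county] += int(oil)
The dict is keyed on the county alone, so every year for a county is added into the same total. If you want the dict to store a sum for each combination of two values, both values need to be part of the dict key.
Add the line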
counties_years = {}
and then accumulate the sums as follows, using the tuple (county, year) as the key:
if (county,year) in counties_years:
    counties_years[(county,year)] += int(oil)
else:
    counties_years[(county,year)] = int(oil)
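To show how the tuple key fits into the whole script, here is a minimal sketch that sums both Oil and Gas per (county, year) and writes one row per pair. It assumes the file is comma-separated with the header row shown in the question. The `consolidate` helper is a hypothetical name introduced for illustration, and the `io.StringIO` objects stand in for `Texas.csv` and `TexasConsolidated.csv` so the example runs on its own.

```python
import csv
import io
from collections import defaultdict


def consolidate(rows):
    """Sum Oil and Gas per (county, year).

    `rows` is an iterable of [county, year, oil, gas] lists (header excluded).
    This helper is illustrative and not part of the original question.
    """
    totals = defaultdict(lambda: [0, 0])  # (county, year) -> [oil, gas]
    for county, year, oil, gas in rows:
        entry = totals[(county, year)]
        entry[0] += int(oil)
        entry[1] += int(gas)
    return totals


# In-memory sample copied from the question, standing in for Texas.csv.
# Assumption: the real file is comma-separated.
sample = io.StringIO(
    "CountyName,Year,Oil,Gas\n"
    "ANDERSON,2010,1358,0\n"
    "ANDERSON,2010,621746,4996766\n"
    "ANDERSON,2011,1587,0\n"
    "ANDERSON,2011,633120,5020877\n"
)

reader = csv.reader(sample)
header = next(reader)  # consume the header row before summing
totals = consolidate(reader)

out = io.StringIO()  # stands in for TexasConsolidated.csv
writer = csv.writer(out, lineterminator="\n")
writer.writerow(header)
for (county, year), (oil, gas) in sorted(totals.items()):
    writer.writerow([county, year, oil, gas])

print(out.getvalue())
```

With the real files, the two `io.StringIO` objects would be replaced by `open('Texas.csv', newline='')` and `open('TexasConsolidated.csv', 'w', newline='')`. Reading the header with `next(reader)` also removes the need for the `index` counter.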