Pandas groupby and sum not giving the correct values

Asked: 2017-07-20 02:03:19

Tags: python pandas

I want to group my data by country and year and sum the value column using pandas. Currently I read a CSV file and use the following:

data_cleaned = df.groupby(['Country', 'year'], as_index=False).sum()

Here is a sample of my dataset:

Country year  value
Angola  2009    0
Angola  2009    0
Angola  2010    0
Angola  2010    0
Angola  2010    0
Angola  2010    0
Angola  2011    0
Angola  2011    0
Angola  2011    0
Angola  2011    0
Angola  2012    118
Angola  2012    0
Angola  2012    0
Angola  2012    0
Angola  2013    0
Angola  2013    0
Angola  2013    0
Angola  2013    0
Angola  2014    0
Angola  2014    0
Angola  2014    0
Angola  2014    0
Angola  2015    0
Angola  2015    0
Angola  2015    0
Angola  2015    0
Angola  2016    0
Angola  2016    0
Angola  2016    0
Angola  2016    0
Angola  2017    0
Australia   2009    0
Australia   2009    14
Australia   2009    0
Australia   2009    12
Australia   2010    0
Australia   2010    0
Australia   2010    54
Australia   2010    6
Australia   2011    0
Australia   2011    4
Australia   2011    17
Australia   2011    13
Australia   2012    8
Australia   2012    2
Australia   2012    4
Australia   2012    105
Australia   2013    0
Australia   2013    5
Australia   2013    0
Australia   2013    0
Australia   2014    0
Australia   2014    0
Australia   2014    0
Australia   2014    0
Australia   2015    0
Australia   2015    0
Australia   2015    0
Australia   2015    0
Australia   2016    0
Australia   2016    0
Australia   2016    0
Australia   2016    0
Australia   2017    0

But I get the following result:

Partner Country year    value
0   Angola  2009    0.00
1   Angola  2010    0.00
2   Angola  2011    0.00
3   Angola  2012    86,280.00
4   Angola  2013    0.00
5   Angola  2014    0.00
6   Angola  2015    0.00
7   Angola  2016    0.00
8   Angola  2017    0.00
9   Australia   2009    54,879.00
10  Australia   2010    67,899.00
11  Australia   2011    50,965.00
12  Australia   2012    332,128.00
13  Australia   2013    16,515.00
14  Australia   2014    0.00
15  Australia   2015    0.00
16  Australia   2016    0.00
17  Australia   2017    0.00

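This is clearly wrong: Angola has only one non-zero value, and it falls in 2012. The year in the result is right, but I expected 118, not 86,280.00. Can anyone point out what I am doing wrong, and how to correctly sum the value column by the Country and year columns?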
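For reference, running the same groupby on only the sample rows shown above does give 118 for Angola in 2012. One hypothesis, since the full CSV is not shown, is that it holds more rows per Country/year group than the excerpt does. The sketch below is a minimal check, assuming the columns are named Country, year and value as in the sample. The names `sample`, `totals` and `checks` are illustrative only and do not come from the original code.

```python
import io

import pandas as pd

# A few rows copied from the sample in the question (whitespace-separated).
# In practice df would come from the real CSV file, which is not shown here.
sample = """Country year value
Angola 2012 118
Angola 2012 0
Angola 2012 0
Angola 2012 0
Australia 2009 0
Australia 2009 14
Australia 2009 0
Australia 2009 12
"""
df = pd.read_csv(io.StringIO(sample), sep=r"\s+")

# Sum only the value column, so no other numeric column is aggregated too.
totals = df.groupby(["Country", "year"], as_index=False)["value"].sum()
print(totals)

# Diagnostic: rows per group, their sum, and the largest single value.
# A group size far above what the sample suggests would point to extra rows
# in the full file being included in the sum.
checks = df.groupby(["Country", "year"])["value"].agg(["size", "sum", "max"])
print(checks)
```

Running the same `checks` aggregation on the full DataFrame and comparing the `size` and `max` columns with the sample is one way to see whether the larger totals come from additional rows rather than from `groupby` itself.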

0 answers:

No answers yet