Pandas groupby: selecting only two values based on 2 groups and converting the rest to 0

Time: 2018-01-10 21:44:23

Tags: python pandas pandas-groupby

This is a follow-up to a question of mine that was answered here: Pandas groupby selecting only one value based on 2 groups and converting rest to 0

I have a pandas DataFrame with a datetime index that looks like this:

df =

           Fruit    Quantity
01/02/10    Apple   4
01/02/10    Apple   6
01/02/10    Apple   12
01/02/10    Pear    7
01/02/10    Grape   8
01/02/10    Grape   5
02/02/10    Apple   2
02/02/10    Fruit   6
02/02/10    Pear    8
02/02/10    Pear    5
02/02/10    Apple   2
02/02/10    Apple   2

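Now, for each date and each fruit, I want to keep only two values (preferably the first two) and set the remaining rows for that fruit on that date to zero. So the desired output is as follows: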

           Fruit    Quantity
01/02/10    Apple   4
01/02/10    Apple   6
01/02/10    Apple   0
01/02/10    Pear    7
01/02/10    Grape   8
01/02/10    Grape   5
02/02/10    Apple   2
02/02/10    Fruit   6
02/02/10    Pear    8
02/02/10    Pear    5
02/02/10    Apple   2
02/02/10    Apple   0

This is just a small example; my main DataFrame has over 3 million rows, and the fruits are not necessarily in order within each date.

Thanks

1 answer:

Answer 0 (score: 2)

Use cumcount after grouping by date (the index) and Fruit, then zero out the rows whose count is greater than 1:

# cumcount numbers the rows within each (date, Fruit) group as 0, 1, 2, ...;
# where() keeps Quantity for the first two rows of a group and writes 0 elsewhere
df['QuantityTrimmed'] = df.Quantity.where(df.groupby([df.index, df.Fruit]).cumcount() < 2, 0)

print(df)
#          Fruit  Quantity  QuantityTrimmed
#01/02/10  Apple         4                4
#01/02/10  Apple         6                6
#01/02/10  Apple        12                0
#01/02/10   Pear         7                7
#01/02/10  Grape         8                8
#01/02/10  Grape         5                5
#02/02/10  Apple         2                2
#02/02/10  Fruit         6                6
#02/02/10   Pear         8                8
#02/02/10   Pear         5                5
#02/02/10  Apple         2                2
#02/02/10  Apple         2                0
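
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample frame from the question, with the dates kept as plain strings rather than real datetimes (an assumption made only to keep the example short). The names `position` and `trimmed` are illustrative and do not appear in the original answer.

```python
import pandas as pd

# Rebuild the sample data from the question; the index holds date strings here,
# which is an assumption for brevity (a DatetimeIndex should group the same way).
dates = ["01/02/10"] * 6 + ["02/02/10"] * 6
fruit = ["Apple", "Apple", "Apple", "Pear", "Grape", "Grape",
         "Apple", "Fruit", "Pear", "Pear", "Apple", "Apple"]
quantity = [4, 6, 12, 7, 8, 5, 2, 6, 8, 5, 2, 2]
df = pd.DataFrame({"Fruit": fruit, "Quantity": quantity}, index=dates)

# cumcount numbers the rows of each (date, fruit) group 0, 1, 2, ... in their
# original order, even when the rows of a group are not adjacent.
position = df.groupby([df.index, df["Fruit"]]).cumcount()

# Keep Quantity for the first two rows of each group and write 0 everywhere else.
trimmed = df["Quantity"].where(position < 2, 0)

print(position.tolist())  # [0, 1, 2, 0, 0, 1, 0, 0, 0, 1, 1, 2]
print(trimmed.tolist())   # [4, 6, 0, 7, 8, 5, 2, 6, 8, 5, 2, 0]
```

Note that the last Apple row on 02/02/10 gets position 2 even though other fruits sit between the Apple rows, which is why the rows do not need to be sorted by fruit first.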