I have a DataFrame of transaction-like records:
branch daqu from to style color size amount
5 huadong shanghai C30C C30F EEBW52301M 39 165 3
8 huadong shanghai C30F C306 EEBW52301M 51 160 2
2 huadong shanghai C30G C306 EEBW52301M 39 165 10
9 huadong shanghai C30G C30C EEBW52301M 51 170 1
1 huadong shanghai C30G C30F EEBW52301M 39 160 7
7 huadong shanghai C30J C30D EEBW52301M 39 170 2
6 huadong shanghai C30J C30F EEBW52301M 39 170 4
3 huadong shanghai C30K C306 EEBW52301M 39 165 1
0 huadong shanghai C30K C30F EEBW52301M 39 160 7
4 huadong shanghai C30K C30F EEBW52301M 39 165 6
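Each row means we have to send 'amount' units of a given style/color/size product from the 'from' store to the 'to' store.
I then summed the amounts for each 'from'/'to' combination, so I can see how many products go into each box.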
print(dh_final[['from', 'to', 'amount']].groupby(['from', 'to']).sum())
amount
from to
C30C C30F 3
C30F C306 2
C30G C306 10
C30C 1
C30F 7
C30J C30D 2
C30F 4
C30K C306 1
C30F 13
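Finally, if fewer than 5 products go from one store to another, I want to cancel the transactions for that shipment. That is, I have to delete the corresponding rows from the original DataFrame. Done by hand, the result should look like this: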
branch daqu from to style color size amount
2 huadong shanghai C30G C306 EEBW52301M 39 165 10
1 huadong shanghai C30G C30F EEBW52301M 39 160 7
0 huadong shanghai C30K C30F EEBW52301M 39 160 7
4 huadong shanghai C30K C30F EEBW52301M 39 165 6
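Is there a simple way to do this? How can I use the result of groupby().sum() to manipulate the original DataFrame?
Answer 0 (score: 1)
If I understand you correctly, you want this: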
In [53]:
df['sum'] = df.groupby(['from', 'to'])['amount'].transform('sum')
df[df['sum'] > 5]
Out[53]:
branch daqu from to style color size amount sum
2 huadong shanghai C30G C306 EEBW52301M 39 165 10 10
1 huadong shanghai C30G C30F EEBW52301M 39 160 7 7
0 huadong shanghai C30K C30F EEBW52301M 39 160 7 13
4 huadong shanghai C30K C30F EEBW52301M 39 165 6 13
So I call transform on the groupby object, which returns a Series aligned with the original df. After adding it as the 'sum' column, I can filter the df as usual.
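To make the alignment concrete, here is a minimal sketch that rebuilds five of the rows from the question (only the from, to and amount columns, with their original index labels). The names per_pair and per_row are just illustrative. It contrasts sum(), which collapses each pair to one row, with transform('sum'), which keeps one value per original row.

```python
import pandas as pd

# A subset of the question's rows, keeping the original index labels.
df = pd.DataFrame(
    {
        "from": ["C30K", "C30G", "C30G", "C30K", "C30C"],
        "to": ["C30F", "C30F", "C306", "C30F", "C30F"],
        "amount": [7, 7, 10, 6, 3],
    },
    index=[0, 1, 2, 4, 5],
)

# sum() collapses each (from, to) pair to a single row ...
per_pair = df.groupby(["from", "to"])["amount"].sum()

# ... while transform('sum') broadcasts each pair's total back onto every
# row of that pair, so the result shares df's index.
per_row = df.groupby(["from", "to"])["amount"].transform("sum")

print(per_pair)
print(per_row)
```

Because per_row has the same index as df, a comparison on it gives a boolean mask that can be passed straight to df[...].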
Edit
Actually, I think you can just do this:
In [67]:
df[df.groupby(['from', 'to'])['amount'].transform('sum') > 5]
Out[67]:
branch daqu from to style color size amount
2 huadong shanghai C30G C306 EEBW52301M 39 165 10
1 huadong shanghai C30G C30F EEBW52301M 39 160 7
0 huadong shanghai C30K C30F EEBW52301M 39 160 7
4 huadong shanghai C30K C30F EEBW52301M 39 165 6
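One detail worth checking against your real data: the question asks to cancel pairs with fewer than 5 products, which read literally keeps a pair whose total is exactly 5, whereas > 5 drops it. The sample data has no pair totalling exactly 5, so both conditions give the same rows there. The sketch below uses made-up rows (the store codes A, B and C are hypothetical) to show the difference, along with groupby().filter() as an alternative way to write the same selection.

```python
import pandas as pd

# Hypothetical rows: A->B totals exactly 5, A->C totals 4, B->C totals 9.
df = pd.DataFrame(
    {
        "from": ["A", "A", "A", "B"],
        "to": ["B", "B", "C", "C"],
        "amount": [2, 3, 4, 9],
    }
)

totals = df.groupby(["from", "to"])["amount"].transform("sum")

strict = df[totals > 5]      # drops the pair that totals exactly 5
inclusive = df[totals >= 5]  # keeps it; only totals below 5 are removed

# groupby().filter() keeps every row of each group for which the function
# returns True, and returns them in their original order.
via_filter = df.groupby(["from", "to"]).filter(lambda g: g["amount"].sum() >= 5)

print(strict)
print(inclusive)
print(via_filter)
```

The transform version is usually the faster of the two on large frames, since filter calls a Python function once per group.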