I have a dataset that looks like this:
        Time Sent  Contract   B/S  Price     Qty
9    10:05:46 815         A   BUY   0.55  154200
11   10:08:47 988         A  SELL   0.56  154200
113  10:20:52 823         B   BUY   0.39  505000
114  14:33:59 424         B  SELL   0.39  505000
31   11:31:44 657         C   BUY   0.92  201000
32   11:36:54 947         C  SELL   0.92  201000
33   11:42:52 228         C   BUY   0.92  179300
What I want to achieve is to sum Qty if, and only if, all the other columns match: rows that agree on Time Sent, Contract, B/S and Price should be merged into a single row holding their combined Qty.
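I am quite happy with the layout of the data frame and do not want to use df.groupby() in a way that would destroy the current row order. Also note that the first column is the original index position, which I have not reset.
Any help would be greatly appreciated. Thanks!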
Answer 0 (score: 1)
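You first need to turn the index into a column with reset_index, then groupby on the other columns and aggregate with agg, taking first for the index column and sum for the Qty column. Passing sort=False keeps the groups in their order of first appearance, so the current row order is preserved: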
df = (df.reset_index()
        .groupby(['Time Sent', 'Contract', 'B/S', 'Price'], as_index=False, sort=False)
        .agg({'index': 'first', 'Qty': 'sum'})
        .set_index('index')
        .rename_axis(None))
print(df)
        Time Sent  Contract   B/S  Price     Qty
9    10:05:46 815         A   BUY   0.55  154200
11   10:08:47 988         A  SELL   0.56  154200
113  10:20:52 823         B   BUY   0.39  505000
114  14:33:59 424         B  SELL   0.39  505000
31   11:31:44 657         C   BUY   0.92  201000
32   11:36:54 947         C  SELL   0.92  201000
33   11:42:52 228         C   BUY   0.92  179300
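The sample above contains no rows that match on every column, so the printed result looks the same as the input. As a minimal sketch of the effect, the example below uses a small made-up frame in which the rows at index 33 and 40 agree on everything except Qty. The data, the index label 40 and the names `keys` and `result` are assumptions for illustration only, not part of the original question.

```python
import pandas as pd

# Hypothetical data: rows 33 and 40 match on every column except Qty.
df = pd.DataFrame(
    {
        "Time Sent": ["11:31:44 657", "11:36:54 947", "11:42:52 228", "11:42:52 228"],
        "Contract": ["C", "C", "C", "C"],
        "B/S": ["BUY", "SELL", "BUY", "BUY"],
        "Price": [0.92, 0.92, 0.92, 0.92],
        "Qty": [201000, 201000, 179300, 20700],
    },
    index=[31, 32, 33, 40],
)

keys = ["Time Sent", "Contract", "B/S", "Price"]

result = (
    df.reset_index()                              # original index becomes a column named "index"
      .groupby(keys, as_index=False, sort=False)  # sort=False keeps first-appearance order
      .agg({"index": "first", "Qty": "sum"})      # keep the first index label, add up Qty
      .set_index("index")
      .rename_axis(None)                          # drop the leftover "index" axis name
)
print(result)
```

Here the two matching rows collapse into one row that keeps index label 33 with a Qty of 200000, while rows 31 and 32 are left unchanged and stay in their original positions.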
If the values in the index are not needed, you can skip preserving them and let the result get a fresh default index:
df = df.groupby(['Time Sent', 'Contract', 'B/S', 'Price'], as_index=False, sort=False)['Qty'].sum()
print(df)
      Time Sent  Contract   B/S  Price     Qty
0  10:05:46 815         A   BUY   0.55  154200
1  10:08:47 988         A  SELL   0.56  154200
2  10:20:52 823         B   BUY   0.39  505000
3  14:33:59 424         B  SELL   0.39  505000
4  11:31:44 657         C   BUY   0.92  201000
5  11:36:54 947         C  SELL   0.92  201000
6  11:42:52 228         C   BUY   0.92  179300
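For intuition about what an order-preserving group-and-sum does, here is a rough plain-Python sketch of the same idea. It is not how pandas implements groupby; the row tuples and the helper name `sum_qty_by_keys` are hypothetical and chosen only for this illustration.

```python
# Hypothetical rows in the form (Time Sent, Contract, B/S, Price, Qty).
rows = [
    ("11:31:44 657", "C", "BUY", 0.92, 201000),
    ("11:36:54 947", "C", "SELL", 0.92, 201000),
    ("11:42:52 228", "C", "BUY", 0.92, 179300),
    ("11:31:44 657", "C", "BUY", 0.92, 1000),  # matches the first row on all keys
]


def sum_qty_by_keys(rows):
    """Sum Qty over rows whose other fields all match, keeping first-seen order."""
    totals = {}  # dicts preserve insertion order, so groups stay in first-seen order
    for *key_fields, qty in rows:
        key = tuple(key_fields)
        totals[key] = totals.get(key, 0) + qty
    return [(*key, qty) for key, qty in totals.items()]


merged = sum_qty_by_keys(rows)
for row in merged:
    print(row)
```

The last row is folded into the first group, so the first entry ends up with a Qty of 202000 and the remaining two rows keep their original relative order.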