Suppose we have the following DataFrame:
index  Query                  Productid  priority
0      3ds                    2125233    0.018946
1      rca                    2009324    0.027599
2      nook                   1517163    0.009443
3      rca                    2877125    0.012054
4      rca                    2877134    0.005557
5      flatscreentvs          2416092    0.011961
6      macbook                3108172    0.010459
7      3ds                    2264036    0.165948
8      rca                    8280834    0.004006
9      memorycard             2740208    0.013744
10     acpowercord            2584273    0.006865
11     zaggiphone             1230537    0.136073
12     watchthethrone         3168067    0.104679
13     remotecontrolextender  7997055    0.113058
14     camcorder              2009041    0.017809
15     3ds                    1988047    0.031711
16     3ds                    1686079    0.043783
17     wirelessheadphones     3770439    0.014714
18     wirelessheadphones     2602403    0.008525
19     samsung40              2126065    0.018066
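For anyone who wants to reproduce this, the frame can be rebuilt roughly as follows. This is only a sketch: the rows are copied from the table above, and the variable names `rows` and `df` are my own choice.

```python
import pandas as pd

# Rows copied from the table above: (Query, Productid, priority).
rows = [
    ("3ds", 2125233, 0.018946),
    ("rca", 2009324, 0.027599),
    ("nook", 1517163, 0.009443),
    ("rca", 2877125, 0.012054),
    ("rca", 2877134, 0.005557),
    ("flatscreentvs", 2416092, 0.011961),
    ("macbook", 3108172, 0.010459),
    ("3ds", 2264036, 0.165948),
    ("rca", 8280834, 0.004006),
    ("memorycard", 2740208, 0.013744),
    ("acpowercord", 2584273, 0.006865),
    ("zaggiphone", 1230537, 0.136073),
    ("watchthethrone", 3168067, 0.104679),
    ("remotecontrolextender", 7997055, 0.113058),
    ("camcorder", 2009041, 0.017809),
    ("3ds", 1988047, 0.031711),
    ("3ds", 1686079, 0.043783),
    ("wirelessheadphones", 3770439, 0.014714),
    ("wirelessheadphones", 2602403, 0.008525),
    ("samsung40", 2126065, 0.018066),
]

df = pd.DataFrame(rows, columns=["Query", "Productid", "priority"])
df.index.name = "index"  # matches the index label shown in the table

print(df.head())
```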
I want to find the top 2 product IDs for a given query, ranked by priority.
For example, if we have query=3ds, the top two products should be:
1. 1988047
2. 1686079
Answer 0: (score: 1)
IIUC, use:
print (df.set_index('Productid').groupby('Query')['priority'].nlargest(2).reset_index())
    Query                  Productid  priority
0   3ds                    2264036    0.165948
1   3ds                    1686079    0.043783
2   acpowercord            2584273    0.006865
3   camcorder              2009041    0.017809
4   flatscreentvs          2416092    0.011961
5   macbook                3108172    0.010459
6   memorycard             2740208    0.013744
7   nook                   1517163    0.009443
8   rca                    2009324    0.027599
9   rca                    2877125    0.012054
10  remotecontrolextender  7997055    0.113058
11  samsung40              2126065    0.018066
12  watchthethrone         3168067    0.104679
13  wirelessheadphones     3770439    0.014714
14  wirelessheadphones     2602403    0.008525
15  zaggiphone             1230537    0.136073
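If only a single query is needed rather than every group, one option is to filter first and then call `DataFrame.nlargest`. The sketch below assumes the column names from the question; the helper name `top_products` is hypothetical, and only a few of the question's rows are included to keep it short.

```python
import pandas as pd

# A subset of the question's rows, enough to exercise two queries.
df = pd.DataFrame({
    "Query": ["3ds", "rca", "3ds", "3ds", "3ds", "rca"],
    "Productid": [2125233, 2009324, 2264036, 1988047, 1686079, 2877125],
    "priority": [0.018946, 0.027599, 0.165948, 0.031711, 0.043783, 0.012054],
})

def top_products(frame, query, n=2):
    """Return the n Productid values with the highest priority for one query."""
    subset = frame[frame["Query"] == query]
    # nlargest sorts by the given column in descending order and keeps n rows.
    return subset.nlargest(n, "priority")["Productid"].tolist()

print(top_products(df, "3ds"))  # [2264036, 1686079]
```

With this data the result for `3ds` agrees with the first two rows of the grouped output above.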
Answer 1: (score: 0)
This is the equivalent of Oracle's row_number() analytic function:
In [172]: df.assign(rn=df.sort_values('priority', ascending=False).groupby('Query').cumcount() + 1).query('rn < 3').sort_values(['Query','rn'])
Out[172]:
index  Query                  Productid  priority  rn
7      3ds                    2264036    0.165948  1
16     3ds                    1686079    0.043783  2
10     acpowercord            2584273    0.006865  1
14     camcorder              2009041    0.017809  1
5      flatscreentvs          2416092    0.011961  1
6      macbook                3108172    0.010459  1
9      memorycard             2740208    0.013744  1
2      nook                   1517163    0.009443  1
1      rca                    2009324    0.027599  1
3      rca                    2877125    0.012054  2
13     remotecontrolextender  7997055    0.113058  1
19     samsung40              2126065    0.018066  1
12     watchthethrone         3168067    0.104679  1
17     wirelessheadphones     3770439    0.014714  1
18     wirelessheadphones     2602403    0.008525  2
11     zaggiphone             1230537    0.136073  1
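The same rows can probably also be obtained without the helper `rn` column, by sorting once and keeping the first two rows of each group with `groupby(...).head(2)`. This is a sketch of that variant rather than the answer's own method, and it uses just a handful of the question's rows.

```python
import pandas as pd

# A few rows from the question: two multi-row queries and one single-row query.
df = pd.DataFrame({
    "Query": ["3ds", "rca", "nook", "rca", "3ds", "3ds", "3ds"],
    "Productid": [2125233, 2009324, 1517163, 2877125, 2264036, 1988047, 1686079],
    "priority": [0.018946, 0.027599, 0.009443, 0.012054, 0.165948, 0.031711, 0.043783],
})

top2 = (
    df.sort_values("priority", ascending=False)  # highest priority first
      .groupby("Query")
      .head(2)                                   # first two rows per query
      .sort_values(["Query", "priority"], ascending=[True, False])
)

print(top2)
```

Queries with fewer than two rows, such as `nook` here, simply contribute whatever rows they have.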