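How can I add an additional column to this DataFrame containing the number of people who sold more, on the same day, than the person in the given row?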
     day  name  sold
0    mon   Ben     2
1    mon   Amy     6
2    mon   Sue     7
3    mon  John     9
4   tues   Ben     9
5   tues   Amy     4
6   tues   Sue    10
7   tues  John     5
8    wed   Ben     8
9    wed   Amy     3
10   wed   Sue    10
11   wed  John     3
The result should look like this:
     day  name  sold  num_who_sold_more
0    mon   Ben     2                  3
1    mon   Amy     6                  2
2    mon   Sue     7                  1
3    mon  John     9                  0
4   tues   Ben     9                  1
5   tues   Amy     4                  3
6   tues   Sue    10                  0
7   tues  John     5                  2
8    wed   Ben     8                  1
9    wed   Amy     3                  2
10   wed   Sue    10                  0
11   wed  John     3                  2
It seems like it should be something like this:
df["num_who_sold_more"] = df.groupby("day")["sold"].transform(
    lambda x: x[x > the_row].count()
)
But I'm not sure how to access the_row from within the transform. Thanks!
Answer 0 (score: 2)
Group by 'day', grab the 'sold' column, and apply .rank:
>>> df.groupby('day')['sold'].rank(ascending=False).astype('int') - 1
0     3
1     2
2     1
3     0
4     1
5     3
6     0
7     2
8     1
9     2
10    0
11    2
dtype: int64
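One caveat on ties: .rank defaults to method='average', so tied values get fractional ranks that astype('int') then truncates. That happens to give the right counts for this data, but if three or more people tie within a day the count can come out too high (three people tied for first would each get 1 instead of 0). A minimal sketch of a tie-safe variant, assuming method='min' is acceptable here; the frame is rebuilt from the question's sample, and the name check is just a hypothetical helper for a brute-force comparison in the spirit of the transform attempt from the question:

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "day": ["mon"] * 4 + ["tues"] * 4 + ["wed"] * 4,
    "name": ["Ben", "Amy", "Sue", "John"] * 3,
    "sold": [2, 6, 7, 9, 9, 4, 10, 5, 8, 3, 10, 3],
})

# With method="min" and ascending=False, a row's rank is 1 plus the number
# of strictly larger values in its group, so rank - 1 is the desired count.
df["num_who_sold_more"] = (
    df.groupby("day")["sold"].rank(method="min", ascending=False).astype(int) - 1
)

# Brute-force cross-check (quadratic per group): inside transform, compare
# the whole group against each row's own value and count the larger ones.
check = df.groupby("day")["sold"].transform(
    lambda s: s.apply(lambda v: (s > v).sum())
)

print(df)
print((check == df["num_who_sold_more"]).all())
```

The rank version is the one to prefer on large frames; the nested apply is only there to show what "access the row" amounts to and to confirm both approaches agree.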