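I'm having trouble sorting data within groups. I have average wage data for 9 different occupations, grouped by sex and country ("GEO"). I want a dataframe in which the occupations are sorted by average wage within each country and sex, so that for every country and sex combination I get the 9 occupations in wage order. This is what I have: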
df
wage Country SEX OCCUPATION
0 6 BELGIUM M Elementary
1 4 BELGIUM M POLICE
2 6 BELGIUM M MANAGERS
3 8 BELGIUM M PROFESSIONALS
2 6 BELGIUM F PROFESSOIONALS
3 8 BELGIUM F MANAGERS
4 7 BELGIUM F POLICE
5 5 FRANCE M POLICE
6 3 FRANCE M PROFESSIONALS
7 2 FRANCE M MANAGERS
But I want this:
wage Country SEX OCCUPATION
1 4 BELGIUM M POLICE
0 6 BELGIUM M Elementary
2 6 BELGIUM M MANAGERS
3 8 BELGIUM M PROFESSIONALS
2 6 BELGIUM F PROFESSOIONALS
4 7 BELGIUM F POLICE
3 8 BELGIUM F MANAGERS
7 2 FRANCE M MANAGERS
6 3 FRANCE M PROFESSIONALS
5 5 FRANCE M POLICE
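Finally, if possible, I would like to assign each row a number from 1 to the number of occupations in its group, following the wage order. To illustrate: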
wage Country SEX OCCUPATION ORDER
1 4 BELGIUM M POLICE 1
0 6 BELGIUM M Elementary 2
2 6 BELGIUM M MANAGERS 3
3 8 BELGIUM M PROFESSIONALS 4
2 6 BELGIUM F PROFESSOIONALS 1
4 7 BELGIUM F POLICE 2
3 8 BELGIUM F MANAGERS 3
7 2 FRANCE M MANAGERS 1
6 3 FRANCE M PROFESSIONALS 2
5 5 FRANCE M POLICE 3
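A minimal sketch of how such an ORDER column could be computed, assuming the sample frame above is rebuilt by hand with a default 0..9 index (the printout repeats index labels 2 and 3, so that index is not reproduced here). It uses `groupby(...).rank(method="first")`, which numbers rows within each (Country, SEX) group by ascending wage and breaks ties by order of appearance. This is one possible approach, not necessarily the only or best one.

```python
import pandas as pd

# Sample data copied from the question; the default RangeIndex is an assumption.
df = pd.DataFrame({
    "wage": [6, 4, 6, 8, 6, 8, 7, 5, 3, 2],
    "Country": ["BELGIUM"] * 7 + ["FRANCE"] * 3,
    "SEX": ["M"] * 4 + ["F"] * 3 + ["M"] * 3,
    "OCCUPATION": [
        "Elementary", "POLICE", "MANAGERS", "PROFESSIONALS",
        "PROFESSOIONALS", "MANAGERS", "POLICE",
        "POLICE", "PROFESSIONALS", "MANAGERS",
    ],
})

# Rank wages within each (Country, SEX) group, lowest wage = 1.
# method="first" gives tied wages distinct consecutive ranks in row order.
df["ORDER"] = (
    df.groupby(["Country", "SEX"])["wage"]
      .rank(method="first")
      .astype(int)
)

print(df)
```

The ranking does not depend on the rows being sorted beforehand, so the column can be added before or after sorting.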
This question is related to: pandas groupby sort within groups. I have read that one, but it did not help. This is what I tried in order to get the df I want:
df=df.sort_values(["Country","SEX","wage"],ascending=False).groupby(["Country","SEX"])
Unfortunately, Python returned this instead of a dataframe:
<pandas.core.groupby.groupby.DataFrameGroupBy object at 0x0000022CC92DC668>
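For context, `groupby` on its own returns a lazy `DataFrameGroupBy` object until an aggregation or transformation is applied, which is why the line above does not print a dataframe. A minimal sketch of the sort without the trailing `groupby`, assuming the sample data shown earlier with a default index, and assuming the wanted order is Country ascending, SEX descending (M before F) and wage ascending:

```python
import pandas as pd

# Sample data copied from the question; the default RangeIndex is an assumption.
df = pd.DataFrame({
    "wage": [6, 4, 6, 8, 6, 8, 7, 5, 3, 2],
    "Country": ["BELGIUM"] * 7 + ["FRANCE"] * 3,
    "SEX": ["M"] * 4 + ["F"] * 3 + ["M"] * 3,
    "OCCUPATION": [
        "Elementary", "POLICE", "MANAGERS", "PROFESSIONALS",
        "PROFESSOIONALS", "MANAGERS", "POLICE",
        "POLICE", "PROFESSIONALS", "MANAGERS",
    ],
})

# sort_values already returns a DataFrame. Passing one flag per column
# lets each key have its own direction: Country A-Z, SEX Z-A, wage low-high.
sorted_df = df.sort_values(
    ["Country", "SEX", "wage"],
    ascending=[True, False, True],
)

print(type(sorted_df))
print(sorted_df)
```

Passing a single `ascending=False`, as in the attempt above, reverses all three keys at once, so wages would come out highest first.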
"GEO","SEX","occ", are all objects
obs_value is a float.
df is a DataFrame.
I would be very grateful if anyone could help.