In [167]:
df
Out[167]:
Gender University
0 Male A
1 Female B
2 Male C
3 Male D
4 Male E
5 Female A
6 Female B
7 Female C
8 Female D
9 Female E
In [168]:
df.groupby(['University','Gender'])['Gender'].size().unstack('Gender').fillna(0)
Out[168]:
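Now I want to sort this by both Female and Male, from highest to lowest, so that when I draw a bar plot the bars appear in descending order. I have tried many approaches without success.
In my latest attempt, I tried: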
df.groupby(['University','Gender'])['Gender'].size().unstack('Gender').fillna(0).sort_values(ascending=False)
TypeError: sort_values() missing 1 required positional argument: 'by'
Any suggestions?
Answer 0: (score: 1)
You can sort by one column or the other:
print (df)
Gender University
0 Male A
1 Female B
3 Male D
4 Male E
5 Female A
2 Male C
3 Male D
4 Male E
5 Female A
6 Female B
7 Female C
8 Female D
4 Male E
5 Female A
6 Female B
3 Male D
4 Male E
5 Female A
7 Female C
8 Female D
9 Female E
# Wrap the chain in parentheses so it can span several lines
df1 = (df.groupby(['University','Gender'])['Gender']
         .size()
         .unstack('Gender', fill_value=0)
         .sort_values(by='Female', ascending=False))
print(df1)
Gender      Female  Male
University
A                4     1
B                3     0
C                2     1
D                2     3
E                1     4
df1.plot.bar()
# Same chain, sorted by the Male counts instead
df2 = (df.groupby(['University','Gender'])['Gender']
         .size()
         .unstack('Gender', fill_value=0)
         .sort_values(by='Male', ascending=False))
print(df2)
Gender      Female  Male
University
E                1     4
D                2     3
A                4     1
C                2     1
B                3     0
df2.plot.bar()
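As for the TypeError in the question: DataFrame.sort_values needs to be told which column(s) to sort on through `by`, whereas Series.sort_values has no such argument because there is only one column of values. A minimal sketch of the difference, using a small made-up `counts` frame (hypothetical values, not the asker's full data):

```python
import pandas as pd

# Hypothetical counts table, shaped like the unstacked result above.
counts = pd.DataFrame(
    {"Female": [4, 3, 1], "Male": [1, 0, 4]},
    index=pd.Index(["A", "B", "E"], name="University"),
)

# On a DataFrame, omitting `by` raises the TypeError from the question.
try:
    counts.sort_values(ascending=False)
    error_raised = False
except TypeError:
    error_raised = True

# Naming the column works.
by_male = counts.sort_values(by="Male", ascending=False)

# A single column is a Series, so no `by` is needed there.
series_sorted = counts["Female"].sort_values(ascending=False)
```

So the attempt in the question only lacks the `by` argument; the rest of the chain is fine.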
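If you sort by both columns, the second column is used only to break ties in the first, i.e. for rows with the same Female count (rows D and C):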
# Sort by Female first; Male only decides ties in Female
df3 = (df.groupby(['University','Gender'])['Gender']
         .size()
         .unstack('Gender', fill_value=0)
         .sort_values(by=['Female', 'Male'], ascending=False))
print(df3)
df3.plot.bar()
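To try the two-column sort without the original frame, here is a self-contained sketch. It assumes the data can be rebuilt from the counts printed for df1 above; the `rows` list is a reconstruction for illustration, not the asker's actual frame, and the plotting call is left out so it runs without matplotlib.

```python
import pandas as pd

# Rebuild 21 rows from the per-university counts shown for df1
# (a reconstruction for illustration only).
rows = (
    [("Male", "A")] * 1 + [("Female", "A")] * 4
    + [("Female", "B")] * 3
    + [("Male", "C")] * 1 + [("Female", "C")] * 2
    + [("Male", "D")] * 3 + [("Female", "D")] * 2
    + [("Male", "E")] * 4 + [("Female", "E")] * 1
)
df = pd.DataFrame(rows, columns=["Gender", "University"])

# Count rows per (University, Gender), pivot Gender into columns,
# then sort by Female and use Male to break ties.
df3 = (
    df.groupby(["University", "Gender"])["Gender"]
    .size()
    .unstack("Gender", fill_value=0)
    .sort_values(by=["Female", "Male"], ascending=False)
)
print(df3)
# University order: A, B, D, C, E
```

D and C both have 2 in Female, so the Male count (3 versus 1) decides that D comes first; every other row is already ordered by Female alone.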