I have a DataFrame like this:
student class subject date status
jack class-1 maths 20150101 fail
jack class-1 maths 20150205 win
jack class-1 maths 20150310 fail
jack class-1 maths 20150415 fail
mathew class-2 maths 20150102 win
mathew class-2 maths 20150208 win
mathew class-2 maths 20150315 win
john class-3 maths 20150125 fail
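This is the status of students in a maths competition on different dates; some students missed the competition on certain dates.
How can I use the pandas pivot_table function to get the following result?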
student class subject fail win
jack class-1 maths 3 1
mathew class-2 maths 0 3
john class-3 maths 1 0
Answer 0 (score: 0)
You can use pivot_table to count the rows for each status, then reset_index to turn the index levels back into ordinary columns:
df = df.pivot_table(index=['student','class','subject'],
                    columns='status',
                    values='date',
                    aggfunc=len,
                    fill_value=0).reset_index()
print(df)
status student class subject fail win
0 jack class-1 maths 3 1
1 john class-3 maths 1 0
2 mathew class-2 maths 0 3
Finally, you can remove the columns name (the "status" label above the index) with rename_axis (new in pandas 0.18.0):
df = df.rename_axis(None, axis=1)
# for pandas below 0.18.0:
# df.columns.name = None
print(df)
student class subject fail win
0 jack class-1 maths 3 1
1 john class-3 maths 1 0
2 mathew class-2 maths 0 3
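
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand (the construction of `df` and the variable name `counts` are my own, not from the original post) and chains the same three steps: pivot_table, reset_index, rename_axis. It assumes a pandas version of 0.18.0 or later.

```python
import pandas as pd

# Sample data typed in from the question (assumed to match the original frame).
df = pd.DataFrame({
    "student": ["jack"] * 4 + ["mathew"] * 3 + ["john"],
    "class": ["class-1"] * 4 + ["class-2"] * 3 + ["class-3"],
    "subject": ["maths"] * 8,
    "date": [20150101, 20150205, 20150310, 20150415,
             20150102, 20150208, 20150315, 20150125],
    "status": ["fail", "win", "fail", "fail", "win", "win", "win", "fail"],
})

counts = (
    df.pivot_table(index=["student", "class", "subject"],
                   columns="status",   # one output column per status value
                   values="date",      # any non-null column works as the thing to count
                   aggfunc=len,        # count rows in each group
                   fill_value=0)       # students with no 'win' or 'fail' rows get 0
      .reset_index()                   # move student/class/subject back to columns
      .rename_axis(None, axis=1)       # drop the leftover 'status' columns name
)
print(counts)
```

The rows come back sorted by the index columns (jack, john, mathew) rather than in the order they first appear, which is why the output order differs from the table in the question.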