Question

I have a DataFrame like this:

student     class       subject       date          status

jack        class-1     maths       20150101        fail
jack        class-1     maths       20150205        win
jack        class-1     maths       20150310        fail
jack        class-1     maths       20150415        fail
mathew      class-2     maths       20150102        win
mathew      class-2     maths       20150208        win
mathew      class-2     maths       20150315        win
john        class-3     maths       20150125        fail

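This is the maths competition status of students on different dates; some students miss the competition on some dates. How can I use the pandas pivot_table function to get a result like this: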

student     class       subject  fail   win
jack        class-1     maths      3     1
mathew      class-2     maths      0     3
john        class-3     maths      1     0

Answer 1

You can use pivot_table with reset_index:

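# Count the rows for each (student, class, subject) group per status value;
# fill_value=0 fills in combinations that never occur.
df = df.pivot_table(index=['student', 'class', 'subject'],
                    columns='status',
                    values='date',
                    aggfunc=len,
                    fill_value=0).reset_index()
print(df)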
status student    class subject  fail  win
0         jack  class-1   maths     3    1
1         john  class-3   maths     1    0
2       mathew  class-2   maths     0    3
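
The answer assumes df already exists. As a minimal, self-contained sketch, the sample data from the question can be built by hand and passed through the same pivot_table call. The names rows and result are illustrative, and storing date as an integer is an assumption, since the question does not state the column's dtype.

```python
import pandas as pd

# Sample data copied from the question; storing 'date' as int is an assumption.
rows = [
    ('jack',   'class-1', 'maths', 20150101, 'fail'),
    ('jack',   'class-1', 'maths', 20150205, 'win'),
    ('jack',   'class-1', 'maths', 20150310, 'fail'),
    ('jack',   'class-1', 'maths', 20150415, 'fail'),
    ('mathew', 'class-2', 'maths', 20150102, 'win'),
    ('mathew', 'class-2', 'maths', 20150208, 'win'),
    ('mathew', 'class-2', 'maths', 20150315, 'win'),
    ('john',   'class-3', 'maths', 20150125, 'fail'),
]
df = pd.DataFrame(rows, columns=['student', 'class', 'subject', 'date', 'status'])

# Same call as in the answer: one row per group, one column per status value.
result = df.pivot_table(index=['student', 'class', 'subject'],
                        columns='status',
                        values='date',
                        aggfunc=len,
                        fill_value=0).reset_index()
print(result)

# The columns axis still carries the name 'status' at this point.
print(result.columns.name)
```

The row order follows the sorted group keys (jack, john, mathew), not the order in which students first appear in the input.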

Finally, you can remove the columns name (the status label in the header) with rename_axis (new in pandas 0.18.0):

df = df.rename_axis(None, axis=1)
# pandas below 0.18.0:
# df.columns.name = None
print(df)
  student    class subject  fail  win
0    jack  class-1   maths     3    1
1    john  class-3   maths     1    0
2  mathew  class-2   maths     0    3
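
The output above lists john before mathew because pivot_table sorts the group keys, while the table requested in the question lists students in order of first appearance. The following is a hedged sketch of one way to restore that order, not part of the original answer: reindex the pivoted table with the de-duplicated key columns before resetting the index. The names keys, counts, order and result are illustrative, and MultiIndex.from_frame requires a reasonably recent pandas (0.24 or later).

```python
import pandas as pd

# Sample data copied from the question; storing 'date' as int is an assumption.
df = pd.DataFrame({
    'student': ['jack'] * 4 + ['mathew'] * 3 + ['john'],
    'class':   ['class-1'] * 4 + ['class-2'] * 3 + ['class-3'],
    'subject': ['maths'] * 8,
    'date':    [20150101, 20150205, 20150310, 20150415,
                20150102, 20150208, 20150315, 20150125],
    'status':  ['fail', 'win', 'fail', 'fail', 'win', 'win', 'win', 'fail'],
})

keys = ['student', 'class', 'subject']

# Pivot as in the answer, but keep the MultiIndex for now.
counts = df.pivot_table(index=keys, columns='status', values='date',
                        aggfunc=len, fill_value=0)

# Key combinations in the order they first appear in the input.
order = pd.MultiIndex.from_frame(df[keys].drop_duplicates())

# Reorder the rows, flatten the index, and drop the 'status' columns name.
result = counts.reindex(order).reset_index().rename_axis(None, axis=1)
print(result)
```

With the sample data this should list jack, mathew, john in that order, matching the layout requested in the question.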
