Question

I have a pandas DataFrame, matches, containing match results, as shown below:

year    winner      loser   score
1990    A           B       6-0
1990    B           C       5-0 RET
1990    A           B       4-0 RET
1990    C           C       6-0
1991    A           B       6-1
1991    A           C       4-1 RET
1991    B           A       6-4
1991    C           A       3-0 RET

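I want to create a new DataFrame that holds, for each player and each year, the number of wins, losses, and wins by retirement (rets). The final output should be: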

year    player      wins    losses      rets
1990    A           2       0           1
1990    B           1       2           1
1990    C           1       2           0
1991    A           2       2           1
1991    B           1       1           0
1991    C           1       1           1

For wins and losses, I can do this successfully. This is how I do it:

w_group = matches.groupby(['year', 'winner']).size()
l_group = matches.groupby(['year', 'loser']).size()

Then I create a new DataFrame:

scores = pd.DataFrame({'wins' : w_group, 'losses' : l_group}).fillna(0)
# name the index levels
scores.index.names = ['year','player']

However, I don't know how to build the column that counts wins by retirement. I tried this:

ret_group = matches.groupby(['year', 'winner']).apply(lambda x: x[(x['score'].str.contains('RET').fillna(False))].count())

But this gave me the following exception:

raise KeyError('%s not in index' % objarr[mask])
KeyError: '[ 0.] not in index'

Any solution would be much appreciated.

Answer 1

I changed

ret_group = matches.groupby(['year', 'winner']).apply(lambda x: x[(x['score'].str.contains('RET').fillna(False))].count())

to

ret_group = matches.groupby(['year', 'winner']).apply(lambda x: (x['score'].str.contains('RET').fillna(False)).sum())

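and now it works. Instead of filtering the rows and calling count(), this sums the boolean mask, which counts the rows in each group whose score contains 'RET'.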
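For reference, here is a minimal end-to-end sketch of the whole computation on the sample data from the question. It is one possible arrangement, not necessarily what the asker ended up with: it builds the retirement flag once with str.contains(..., na=False) and groups that boolean Series directly instead of using apply. The names is_ret, wins, losses and rets are illustrative, and the rename_axis calls are an added precaution so that all three Series share the same index level names before being combined.

```python
import pandas as pd

# Sample data copied from the question.
matches = pd.DataFrame({
    'year':   [1990, 1990, 1990, 1990, 1991, 1991, 1991, 1991],
    'winner': ['A', 'B', 'A', 'C', 'A', 'A', 'B', 'C'],
    'loser':  ['B', 'C', 'B', 'C', 'B', 'C', 'A', 'A'],
    'score':  ['6-0', '5-0 RET', '4-0 RET', '6-0',
               '6-1', '4-1 RET', '6-4', '3-0 RET'],
})

names = ['year', 'player']

# Wins and losses per (year, player), as in the question.
wins = matches.groupby(['year', 'winner']).size().rename_axis(names)
losses = matches.groupby(['year', 'loser']).size().rename_axis(names)

# Boolean flag: True where the match ended by retirement.
# na=False treats missing scores as "not a retirement".
is_ret = matches['score'].str.contains('RET', na=False)

# Summing booleans counts the True values in each (year, winner) group.
rets = is_ret.groupby([matches['year'], matches['winner']]).sum().rename_axis(names)

# Align the three Series on their index; missing combinations become 0.
scores = (
    pd.DataFrame({'wins': wins, 'losses': losses, 'rets': rets})
    .fillna(0)
    .astype(int)
)

print(scores)
```

Calling scores.reset_index() afterwards turns year and player back into ordinary columns, which matches the layout of the desired output in the question.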
