I have the following two DataFrames, stats:
player_id player_name gp ab run hit
28920 S. Smith 1 2 1 3
33351 T. Mancini 0 0 0 0
30267 C. Gentry 0 0 0 0
34885 H. Kim 1 0 0 0
31988 J. Schoop 0 0 0 0
5908 J.J. Hardy 1 3 0 0
and game:
player_id player_name gp ab run hit
28920 S. Smith 1 4 1 1
33351 T. Mancini 1 1 0 1
34885 H. Kim 1 1 2 0
5908 J.J. Hardy 1 4 0 0
I want to update stats, matching on player_id, only for the players who were active in the last game, so that the final stats DataFrame looks like this:
player_id player_name gp ab run hit
28920 S. Smith 2 6 2 4
33351 T. Mancini 1 1 0 1
30267 C. Gentry 0 0 0 0
34885 H. Kim 2 1 2 0
31988 J. Schoop 0 0 0 0
5908 J.J. Hardy 2 7 0 0
Thank you for your time and help!
Answer 0 (score: 6)
You can use set_index and update:
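# Index both frames by the player keys so that rows align by player
stats=stats.set_index(['player_id','player_name'])
game=game.set_index(['player_id','player_name'])
# update() overwrites the matching rows of stats in place with the values from game
stats.update(game)
# Cast back to int in case update() left float columns behind
stats = stats.astype(int).reset_index()
stats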
Out[452]:
player_id player_name gp ab run hit
0 28920 S.Smith 1 4 1 1
1 33351 T.Mancini 1 1 0 1
2 30267 C.Gentry 0 0 0 0
3 34885 H.Kim 1 1 2 0
4 31988 J.Schoop 0 0 0 0
5 5908 J.J.Hardy 1 4 0 0
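Note that the output above overwrites each active player's row with the game values instead of summing them, so it does not match the totals asked for in the question. A minimal sketch of that behavior, using only two of the players from the question and indexing by player_id alone for brevity (both simplifications are assumptions made for illustration):

```python
import pandas as pd

# Two rows taken from the question's stats table (S. Smith and C. Gentry)
stats = pd.DataFrame(
    {"gp": [1, 0], "ab": [2, 0]},
    index=pd.Index([28920, 30267], name="player_id"),
)
# Only S. Smith played in the last game
game = pd.DataFrame(
    {"gp": [1], "ab": [4]},
    index=pd.Index([28920], name="player_id"),
)

# update() modifies stats in place and returns None
result = stats.update(game)

# The matching row now holds the game values (ab is 4, not 2 + 4);
# the row without a match is left unchanged
stats = stats.astype(int)
print(stats)
```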
Since you want the game values added to the existing totals rather than overwriting them, use add:
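# Run these on the original (not yet updated) frames if they are not already indexed:
#stats=stats.set_index(['player_id','player_name'])
#game=game.set_index(['player_id','player_name'])
# Sum matching rows; fill_value=0 keeps the players who are missing from game
stats.add(game,fill_value=0).astype(int).reset_index()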
Out[460]:
player_id player_name gp ab run hit
0 5908 J.J.Hardy 2 7 0 0
1 28920 S.Smith 2 6 2 4
2 30267 C.Gentry 0 0 0 0
3 31988 J.Schoop 0 0 0 0
4 33351 T.Mancini 1 1 0 1
5 34885 H.Kim 2 1 2 0
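The add result above comes back sorted by the index rather than in the original row order. If the original order of stats matters, one possible approach is to reindex the sum against the original index. The following is a self-contained sketch that rebuilds both frames from the question; the variable names keys, stats_idx, game_idx and updated are chosen here for illustration and are not part of the original answer:

```python
import pandas as pd

cols = ["player_id", "player_name", "gp", "ab", "run", "hit"]
stats = pd.DataFrame(
    [
        [28920, "S. Smith", 1, 2, 1, 3],
        [33351, "T. Mancini", 0, 0, 0, 0],
        [30267, "C. Gentry", 0, 0, 0, 0],
        [34885, "H. Kim", 1, 0, 0, 0],
        [31988, "J. Schoop", 0, 0, 0, 0],
        [5908, "J.J. Hardy", 1, 3, 0, 0],
    ],
    columns=cols,
)
game = pd.DataFrame(
    [
        [28920, "S. Smith", 1, 4, 1, 1],
        [33351, "T. Mancini", 1, 1, 0, 1],
        [34885, "H. Kim", 1, 1, 2, 0],
        [5908, "J.J. Hardy", 1, 4, 0, 0],
    ],
    columns=cols,
)

# Align rows by player; work on indexed copies so the inputs stay untouched
keys = ["player_id", "player_name"]
stats_idx = stats.set_index(keys)
game_idx = game.set_index(keys)

updated = (
    stats_idx.add(game_idx, fill_value=0)  # sum; players absent from game count as 0
    .reindex(stats_idx.index)              # restore the original row order of stats
    .astype(int)                           # alignment can introduce floats
    .reset_index()
)
print(updated)
```

This assumes each (player_id, player_name) pair appears only once in stats, since reindexing against a duplicated index would fail.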