I have a DataFrame:
game_id move_number move colour avg_centi phase
0 03gDhPWr 1 e4 white NaN opening
1 03gDhPWr 2 d5 black 37.0 opening
2 03gDhPWr 3 e5 white 61.0 opening
3 03gDhPWr 4 c5 black -5.0 opening
4 03gDhPWr 5 Nf3 white 26.0 opening
... ... ... ... ... ... ...
110093 zzaiRa7s 36 a5+ black NaN endgame
110094 zzaiRa7s 37 Kxb5 white NaN endgame
110095 zzaiRa7s 38 c6+ black NaN endgame
110096 zzaiRa7s 39 Ka4 white NaN endgame
110097 zzaiRa7s 40 Q@b4# black NaN endgame
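I want to map the colour column so that, wherever colour is black, its value is replaced by the value in the phase column of the same row. Ideally I would like to do this with the pandas map or apply functions, not replace, which is too slow.
The resulting DataFrame should look like this: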
game_id move_number move colour avg_centi phase
0 03gDhPWr 1 e4 white NaN opening
1 03gDhPWr 2 d5 opening 37.0 opening
2 03gDhPWr 3 e5 white 61.0 opening
3 03gDhPWr 4 c5 opening -5.0 opening
4 03gDhPWr 5 Nf3 white 26.0 opening
... ... ... ... ... ... ...
110093 zzaiRa7s 36 a5+ endgame NaN endgame
110094 zzaiRa7s 37 Kxb5 white NaN endgame
110095 zzaiRa7s 38 c6+ endgame NaN endgame
110096 zzaiRa7s 39 Ka4 white NaN endgame
110097 zzaiRa7s 40 Q@b4# endgame NaN endgame
I tried the following code, but it doesn't seem to work:
def wrangle_game_phase(x):
    if x == 'black':
        return phase
    else:
        return x

df['game_type'] = df['colour'].apply(wrangle_game_phase)
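A likely reason the attempt above fails: Series.apply passes only the single colour value to the function, so the name phase is undefined inside it, and the result is written to a new game_type column rather than back to colour. A minimal sketch of a row-wise variant, assuming a small made-up sample with the same column names (the data below is hypothetical, not the original dataset):

```python
import pandas as pd

# Hypothetical sample mirroring the question's colour and phase columns.
df = pd.DataFrame({
    "colour": ["white", "black", "white", "black"],
    "phase": ["opening", "opening", "endgame", "endgame"],
})

def wrangle_game_phase(row):
    # With axis=1, each call receives one full row, so both columns are reachable.
    if row["colour"] == "black":
        return row["phase"]
    return row["colour"]

# Write the result back into colour rather than into a new column.
df["colour"] = df.apply(wrangle_game_phase, axis=1)
print(df)
```

Row-wise apply calls a Python function once per row, so on roughly 110,000 rows it will generally be slower than a vectorised approach.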
Answer 0 (score: 3)
Use boolean indexing with pd.DataFrame.loc:
m = df['colour'] == 'black'
df.loc[m, 'colour'] = df.loc[m, 'phase']
game_id move_number move colour avg_centi phase
0 03gDhPWr 1 e4 white NaN opening
1 03gDhPWr 2 d5 opening 37.0 opening
2 03gDhPWr 3 e5 white 61.0 opening
3 03gDhPWr 4 c5 opening -5.0 opening
4 03gDhPWr 5 Nf3 white 26.0 opening
110093 zzaiRa7s 36 a5+ endgame NaN endgame
110094 zzaiRa7s 37 Kxb5 white NaN endgame
110095 zzaiRa7s 38 c6+ endgame NaN endgame
110096 zzaiRa7s 39 Ka4 white NaN endgame
110097 zzaiRa7s 40 Q@b4# endgame NaN endgame
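The boolean-mask assignment can be tried end to end on a small frame. This is a sketch only, assuming a made-up five-row sample with the question's column names:

```python
import pandas as pd

# Hypothetical five-row sample; only the columns relevant to the mask are kept.
df = pd.DataFrame({
    "move": ["e4", "d5", "e5", "c5", "Nf3"],
    "colour": ["white", "black", "white", "black", "white"],
    "phase": ["opening"] * 5,
})

# Boolean mask: True for rows where black made the move.
m = df["colour"] == "black"

# Overwrite colour with phase on the masked rows only.
df.loc[m, "colour"] = df.loc[m, "phase"]

result = df["colour"].tolist()
print(result)
```

The mask is computed once and reused on both sides, so only the selected rows are read and written.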
Answer 1 (score: 3)
Use np.where, for example:
df['game_type'] = np.where(df["colour"] == 'black', df["phase"], df["colour"])
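A self-contained version of the np.where line, with the imports it needs. The sample data is hypothetical; as in the line above, the result goes into a new game_type column and colour is left unchanged (assign to df["colour"] instead to overwrite in place):

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the question's colour and phase columns.
df = pd.DataFrame({
    "colour": ["white", "black", "white", "black"],
    "phase": ["opening", "opening", "endgame", "endgame"],
})

# np.where(condition, value_if_true, value_if_false), evaluated element-wise.
df["game_type"] = np.where(df["colour"] == "black", df["phase"], df["colour"])
print(df)
```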
Answer 2 (score: 0)
Have you tried .loc? Try:
df.loc[df['colour'] == 'black', 'colour'] = df['phase']
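A .loc assignment of this shape, with a full-length Series on the right-hand side, relies on pandas matching rows by index label. A minimal sketch, assuming the lowercase column names and values shown in the question and a made-up sample; Series.mask is included as an equivalent alternative, not something the answer proposed:

```python
import pandas as pd

# Hypothetical sample mirroring the question's colour and phase columns.
df = pd.DataFrame({
    "colour": ["white", "black", "white", "black"],
    "phase": ["opening", "opening", "endgame", "endgame"],
})

is_black = df["colour"] == "black"

# Variant 1: .loc with a full-length right-hand side; rows are matched by index.
via_loc = df.copy()
via_loc.loc[is_black, "colour"] = via_loc["phase"]

# Variant 2: Series.mask replaces values where the condition is True.
via_mask = df["colour"].mask(is_black, df["phase"])

print(via_loc["colour"].tolist())
print(via_mask.tolist())
```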