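I'm facing a problem in pandas: I make many changes to my data, but in the end I don't know which change produced the final value in a column.
For example, I change the volume like this, and I run many checks like this one: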
# Last check
for i in range(5):
    df_gp.tail(1).loc[ (df_gp['volume']<df_gp['volume'].shift(1)) | (df_gp['volume']<0.4),['new_volume'] ] = df_gp['new_volume']*1.1
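I want to update not only the 'new_volume' column but also a 'commentary' column for the rows where the condition matches.
Is it possible to add that somewhere, so that 'commentary' is updated at the same time as 'new_volume'?
Thanks!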
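Answer 0 (score: 1)
Yes, it is possible with assign, but in my opinion it is less readable; it is better to update each column separately, using a boolean mask cached in a variable: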
df_gp = pd.DataFrame({'volume':[.1,.3,.5,.7,.1,.7],
'new_volume':[5,3,6,9,2,4],
'commentary':list('aaabbb')})
print (df_gp)
   volume  new_volume commentary
0     0.1           5          a
1     0.3           3          a
2     0.5           6          a
3     0.7           9          b
4     0.1           2          b
5     0.7           4          b
#create boolean mask and assign to variable for reuse
m = (df_gp['volume']<df_gp['volume'].shift(1)) | (df_gp['volume']<0.4)
#update both columns with assign under the condition, then assign back only the filtered rows and columns
c = ['commentary','new_volume']
df_gp.loc[m, c] = df_gp.loc[m, c].assign(new_volume=df_gp['new_volume']*1.1,
                                         commentary='updated')
print (df_gp)
   volume  new_volume commentary
0     0.1         5.5    updated
1     0.3         3.3    updated
2     0.5         6.0          a
3     0.7         9.0          b
4     0.1         2.2    updated
5     0.7         4.0          b
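To see why rows 0, 1 and 4 are the ones that get updated, it can help to look at the two parts of the mask separately. The following is only a small sketch on the same sample volumes; the names drop and low are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# same volume values as in the sample DataFrame
volume = pd.Series([.1, .3, .5, .7, .1, .7], name='volume')

# True where the volume is lower than in the previous row;
# the first row is compared against NaN, which evaluates to False
drop = volume < volume.shift(1)

# True where the volume is below the fixed threshold
low = volume < 0.4

# combined mask, same expression as m in the answer
m = drop | low

print(pd.DataFrame({'volume': volume, 'drop': drop, 'low': low, 'm': m}))
```

Printing the parts side by side like this makes it easier to tell which condition caused a given row to be selected.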
#alternative, starting again from the original df_gp: multiply filtered column by scalar
df_gp.loc[m, 'new_volume'] *= 1.1
#set new value in filtered column
df_gp.loc[m, 'commentary'] = 'updated'
print (df_gp)
   volume  new_volume commentary
0     0.1         5.5    updated
1     0.3         3.3    updated
2     0.5         6.0          a
3     0.7         9.0          b
4     0.1         2.2    updated
5     0.7         4.0          b
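Since the question is about knowing which of many checks changed a value, one possible extension is to append a short note per check instead of overwriting 'commentary'. The sketch below is only an illustration under some assumptions: apply_check, the note texts and the second and third checks are hypothetical, and new_volume is stored as floats so that multiplying by 1.1 does not have to change the column dtype. It also restricts one check to the last row by combining the mask with a last-row flag, because assigning through df_gp.tail(1) as in the question is not guaranteed to modify df_gp itself.

```python
import pandas as pd

# same sample volumes as above; new_volume as floats, commentary starts empty
df_gp = pd.DataFrame({'volume': [.1, .3, .5, .7, .1, .7],
                      'new_volume': [5., 3., 6., 9., 2., 4.],
                      'commentary': [''] * 6})


def apply_check(df, mask, factor, note):
    """Hypothetical helper: scale new_volume where mask is True and log a note."""
    df.loc[mask, 'new_volume'] *= factor
    # append rather than overwrite, so notes from earlier checks stay visible
    df.loc[mask, 'commentary'] += note + ';'
    return df


# check 1: the condition from the question
m1 = (df_gp['volume'] < df_gp['volume'].shift(1)) | (df_gp['volume'] < 0.4)
apply_check(df_gp, m1, 1.1, 'low_volume')

# check 2 (made up): a second rule that hits some of the same rows
m2 = df_gp['volume'] < 0.2
apply_check(df_gp, m2, 2.0, 'very_low')

# check 3 (made up): only the last row may be changed
is_last = df_gp.index == df_gp.index[-1]
m3 = (df_gp['volume'] > 0.6) & is_last
apply_check(df_gp, m3, 0.5, 'last_row_high')

print(df_gp)
```

With this approach, each row's 'commentary' lists the checks that touched it, in the order they were applied, and rows that no check matched keep an empty string.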