Question

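I'm running into a problem in pandas: I make many changes to my data, but in the end I can't tell which change produced the final value in a column.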

For example, I adjust the volume like this, and I run many checks of this kind:

# Last check 
for i in range(5):
    df_gp.tail(1).loc[ (df_gp['volume']<df_gp['volume'].shift(1)) | (df_gp['volume']<0.4),['new_volume']  ] = df_gp['new_volume']*1.1

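I want to update not only the 'new_volume' column but also a 'commentary' column whenever the condition matches.

Is it possible to add this somewhere so that 'commentary' is updated at the same time as 'new_volume'?

Thanks!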

Answer 1

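Yes, it is possible with assign, but in my opinion that is less readable. It is better to cache the boolean mask in a variable and update each column separately: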

import pandas as pd

df_gp = pd.DataFrame({'volume':[.1,.3,.5,.7,.1,.7],
                     'new_volume':[5,3,6,9,2,4],
                     'commentary':list('aaabbb')})

print (df_gp)
   volume  new_volume commentary
0     0.1           5          a
1     0.3           3          a
2     0.5           6          a
3     0.7           9          b
4     0.1           2          b
5     0.7           4          b

#create boolean mask and assign to variable for reuse
m = (df_gp['volume']<df_gp['volume'].shift(1)) | (df_gp['volume']<0.4)

#build the updated columns with assign, then write back only the filtered rows
c = ['commentary','new_volume']
df_gp.loc[m, c] = df_gp.loc[m, c].assign(new_volume=df_gp['new_volume']*1.1,
                                         commentary='updated')
print (df_gp)
   volume  new_volume commentary
0     0.1         5.5    updated
1     0.3         3.3    updated
2     0.5         6.0          a
3     0.7         9.0          b
4     0.1         2.2    updated
5     0.7         4.0          b

#alternative, starting again from the original df_gp:
#multiply the filtered rows of the column by a scalar
df_gp.loc[m, 'new_volume'] *= 1.1
#set a new value in the filtered rows of the column
df_gp.loc[m, 'commentary'] = 'updated'
print (df_gp)
   volume  new_volume commentary
0     0.1         5.5    updated
1     0.3         3.3    updated
2     0.5         6.0          a
3     0.7         9.0          b
4     0.1         2.2    updated
5     0.7         4.0          b
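
Since the question is about telling which of several checks changed a value, the same mask pattern can be extended so that each check appends its own label to 'commentary' instead of overwriting it. The following is only a minimal sketch: the helper apply_check, the labels 'drop' and 'low', and the split of the original condition into two separate checks are assumptions made for illustration, not part of the original answer. 'new_volume' is created as float here so that multiplying by 1.1 does not change the column dtype.

```python
import pandas as pd

df_gp = pd.DataFrame({'volume': [.1, .3, .5, .7, .1, .7],
                      'new_volume': [5.0, 3.0, 6.0, 9.0, 2.0, 4.0],
                      'commentary': [''] * 6})

def apply_check(df, mask, factor, label):
    """Hypothetical helper: scale new_volume on the masked rows and log the label."""
    df.loc[mask, 'new_volume'] *= factor
    # append rather than overwrite, so earlier labels are kept
    df.loc[mask, 'commentary'] += label + ';'
    return df

# two illustrative checks; the labels and the factor are made up for this example
drop = df_gp['volume'] < df_gp['volume'].shift(1)
low = df_gp['volume'] < 0.4

apply_check(df_gp, drop, 1.1, 'drop')
apply_check(df_gp, low, 1.1, 'low')
print(df_gp)
```

After both calls, a row that matched both checks carries both labels in 'commentary', so the history of a value can be read directly from that column.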
