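Does anyone know how to modify the "changes" DataFrame so that it only evaluates the cells where the mask is True? I want only the items that changed from df1 to df2 to end up in the changes DataFrame. The line below replaces every cell, and I can't use "mask" on its own because it is multi-dimensional. Thanks!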
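import pandas as pd
import numpy as np

df1 = pd.DataFrame({'Col1': ['blue', 2, 3, 4], 'Col2': [90, 99, 3, 97], 'Col3': [11, 12, 13, 14]})
df2 = pd.DataFrame({'Col1': ['blue', 2, 6], 'Col2': [90, 99, 99], 'Col3': [11, 12, 13]})

# True where the aligned cells differ; row 3 exists only in df1, so it is all True
mask = df2.ne(df1)

# Line in question
# reindex is used instead of .loc because recent pandas raises a KeyError when .loc is given labels missing from df2
changes = (df1.reindex(mask.index).astype(str) + ' changed to: ***' + df2.reindex(mask.index).astype(str)).fillna(df2.astype(str))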
I would like the output to look like:
                   Col1                   Col2                   Col3
0                  blue                     90                     11
1                     2                     99                     12
2    3 changed to: ***6  3 changed to: ***99.0                     13
3  4 changed to: ***nan  97 changed to: ***nan  14 changed to: ***nan
Answer 0 (score: 3)
IIUC, you can use where with the other parameter (see docs):
df1.where(df1.eq(df2), changes)
Output:
                   Col1                   Col2                   Col3
0                  blue                     90                     11
1                     2                     99                     12
2    3 changed to: ***6  3 changed to: ***99.0                     13
3  4 changed to: ***nan  97 changed to: ***nan  14 changed to: ***nan
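For reference, the whole approach can be run end to end as a minimal sketch. It assumes a recent pandas, where .loc raises a KeyError for labels missing from df2, so df2 is aligned with reindex instead. The names df2_aligned, new_text, and unchanged are illustrative and do not come from the question. Missing cells are turned into the text 'nan' explicitly, since how astype(str) renders missing values may differ between pandas versions.

```python
import pandas as pd

df1 = pd.DataFrame({'Col1': ['blue', 2, 3, 4], 'Col2': [90, 99, 3, 97], 'Col3': [11, 12, 13, 14]})
df2 = pd.DataFrame({'Col1': ['blue', 2, 6], 'Col2': [90, 99, 99], 'Col3': [11, 12, 13]})

# Give df2 the same row labels as df1; row 3 does not exist in df2 and becomes NaN.
# (Assumption: reindex stands in for the question's .loc lookup.)
df2_aligned = df2.reindex(df1.index)

# Text form of the new values, with missing cells written out as 'nan'.
new_text = df2_aligned.astype(str).where(df2_aligned.notna(), 'nan')

# Build the "old changed to: ***new" text for every cell.
changes = df1.astype(str) + ' changed to: ***' + new_text

# Keep df1 where the cells are equal, otherwise take the text from changes.
unchanged = df1.eq(df2_aligned)
result = df1.where(unchanged, changes)

print(result)
```

Because where keeps the original value wherever the condition is True, only the cells that differ are replaced by the annotated text.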
Answer 1 (score: 2)
Similar to Scott Boston's approach (credit to him!). You can use a variant of where: mask.
df1.mask(df1.ne(df2), df2)
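This says: wherever df1.ne(df2) is True, fill in the value from df2; otherwise, leave the cell unchanged. Note that this overwrites the changed cells with the new value instead of producing the "changed to" text.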
   Col1  Col2  Col3
0  blue  90.0  11.0
1     2  99.0  12.0
2     6  99.0  13.0
3   NaN   NaN   NaN
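To make the relationship between the two answers concrete, here is a small sketch showing that mask with a "not equal" condition gives the same frame as where with the "equal" condition. The explicit reindex step and the names df2_aligned, via_mask, and via_where are assumptions added for illustration; they are not part of either answer.

```python
import pandas as pd

df1 = pd.DataFrame({'Col1': ['blue', 2, 3, 4], 'Col2': [90, 99, 3, 97], 'Col3': [11, 12, 13, 14]})
df2 = pd.DataFrame({'Col1': ['blue', 2, 6], 'Col2': [90, 99, 99], 'Col3': [11, 12, 13]})

# Align df2 to df1's rows up front; the row missing from df2 becomes NaN.
df2_aligned = df2.reindex(df1.index)

# mask replaces cells where the condition is True ...
via_mask = df1.mask(df1.ne(df2_aligned), df2_aligned)

# ... while where replaces cells where the condition is False.
via_where = df1.where(df1.eq(df2_aligned), df2_aligned)

print(via_mask)
print(via_mask.equals(via_where))
```

Which of the two to use is mostly a matter of which condition reads more naturally: "replace where different" (mask) or "keep where equal" (where).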