Mask data that is not equal to another set of data and store the result

Time: 2018-01-23 20:16:53

Tags: python-3.x pandas dataframe

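Does anyone know how to modify the changes DataFrame so that only the cells where the mask is True are evaluated? I only want the items that changed from df1 to df2 to end up in the changes DataFrame. The line below rewrites every cell, and I can't use mask on its own because it is multidimensional. Thanks!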

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'Col1': ['blue', 2, 3, 4], 'Col2': [90, 99, 3, 97], 'Col3': [11, 12, 13, 14]})
df2 = pd.DataFrame({'Col1': ['blue', 2, 6], 'Col2': [90, 99, 99], 'Col3': [11, 12, 13]})
mask = df2.ne(df1)  # ne aligns on the union of both indexes, so row 3 is included
# Line in question
# (reindex instead of .loc: df2 has no row 3, and .loc raises KeyError for missing labels in pandas >= 1.0)
changes = (df1.reindex(mask.index).astype(str) + ' changed to: ***' + df2.reindex(mask.index).astype(str)).fillna(df2.astype(str))

I would like the output to look like:

                   Col1                   Col2                   Col3
0                  blue                     90                     11
1                     2                     99                     12
2    3 changed to: ***6  3 changed to: ***99.0                     13
3  4 changed to: ***nan  97 changed to: ***nan  14 changed to: ***nan

2 answers:

Answer 0 (score: 3)

IIUC, you can use DataFrame.where with its other parameter (see docs):

df1.where(df1.eq(df2), changes)

Output:

                   Col1                   Col2                   Col3
0                  blue                     90                     11
1                     2                     99                     12
2    3 changed to: ***6  3 changed to: ***99.0                     13
3  4 changed to: ***nan  97 changed to: ***nan  14 changed to: ***nan
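
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The names df2_aligned, same and result are only illustrative and not from the question. It assumes you are happy to reindex df2 to df1's row labels up front, which avoids the label lookup on a row that df2 does not have.

```python
import pandas as pd

df1 = pd.DataFrame({'Col1': ['blue', 2, 3, 4], 'Col2': [90, 99, 3, 97], 'Col3': [11, 12, 13, 14]})
df2 = pd.DataFrame({'Col1': ['blue', 2, 6], 'Col2': [90, 99, 99], 'Col3': [11, 12, 13]})

# Give df2 the same row labels as df1; the missing row 3 is filled with NaN.
df2_aligned = df2.reindex(df1.index)

# True where both frames agree. NaN never compares equal, so row 3 is all False.
same = df1.eq(df2_aligned)

# Build the "old changed to: ***new" text for every cell.
changes = df1.astype(str) + ' changed to: ***' + df2_aligned.astype(str)

# where keeps df1's value where `same` is True and takes the value from `changes` elsewhere.
result = df1.where(same, changes)
print(result)
```

How the missing row 3 is rendered in the text depends on how your pandas version stringifies missing values, so treat that row as version-dependent.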

Answer 1 (score: 2)

Similar to Scott Boston's approach (credit to him!). You can use DataFrame.mask, the counterpart of where:

df1.mask(df1.ne(df2), df2)

This says: wherever df1.ne(df2) is True, fill in the value from df2; otherwise leave the value unchanged.

    Col1    Col2    Col3
0   blue    90.0    11.0
1   2       99.0    12.0
2   6       99.0    13.0
3   NaN     NaN     NaN
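
To make the relationship between the two methods concrete, here is a small sketch (the variable names differs, via_mask and via_where are just for illustration). mask replaces cells where the condition is True, while where keeps cells where the condition is True, so inverting the condition should give the same frame.

```python
import pandas as pd

df1 = pd.DataFrame({'Col1': ['blue', 2, 3, 4], 'Col2': [90, 99, 3, 97], 'Col3': [11, 12, 13, 14]})
df2 = pd.DataFrame({'Col1': ['blue', 2, 6], 'Col2': [90, 99, 99], 'Col3': [11, 12, 13]})

# ne aligns on the union of both indexes; row 3 exists only in df1, so it is all True.
differs = df1.ne(df2)

# Replace cells where the condition is True with the aligned value from df2.
via_mask = df1.mask(differs, df2)

# Keep cells where the (inverted) condition is True, otherwise take df2's value.
via_where = df1.where(~differs, df2)

# equals treats NaN in the same position as equal.
print(via_mask.equals(via_where))
```

Note that row 3 ends up as NaN in both results, because df2 has no value to supply there.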