How to replace one column's values with another column's values under a condition

Date: 2018-11-17 15:21:05

Tags: python pandas

I have a pandas DataFrame like this (output of df.head(10)):

7   RT (min)    Area (Ab*s) Quality patch   similarity
8   10.167      23278313    64      NaN     NaN
9   10.167      23278313    47      NaN     NaN
10  10.167      23278313    38      NaN     NaN
28  10.333      3407159     49      10.167  0.983935
29  10.333      3407159     22      10.167  0.983935
30  10.333      3407159     16      10.167  0.983935
48  10.390      3299202     38      10.333  0.994514
49  10.390      3299202     35      10.333  0.994514
50  10.390      3299202     32      10.333  0.994514
68  10.516      2015786     50      10.390  0.988018

I want to set df['RT (min)'] = df['patch'] on the rows where df['similarity'] > 0.99. For example, df should end up like this:

7   RT (min)    Area (Ab*s) Quality patch   similarity
8   10.167      23278313    64      NaN     NaN
9   10.167      23278313    47      NaN     NaN
10  10.167      23278313    38      NaN     NaN
28  10.333      3407159     49      10.167  0.983935
29  10.333      3407159     22      10.167  0.983935
30  10.333      3407159     16      10.167  0.983935
48  10.333      3299202     38      10.333  0.994514
49  10.333      3299202     35      10.333  0.994514
50  10.333      3299202     32      10.333  0.994514
68  10.516      2015786     50      10.390  0.988018

That is, rows 48, 49 and 50 of RT (min) are replaced by rows 48, 49 and 50 of patch.

I tried:

p = df[df['similarity']>0.99].index.tolist()
df['RT (min)'][p] =df['patch'][p]

but I get an error message, and I don't know how to fix it.

2 answers:

Answer 0 (score: 1)

Something like this:

mask = df['similarity'] > 0.99
df.loc[mask, 'RT'] = df['patch']

For example:

import pandas as pd
df = pd.DataFrame({"RT":[10.1,10.2,10.4],"patch":[float("NaN"),10.3,10.3],"similarity":[float("NaN"),0.9,0.998]})

which produces:

     RT  patch  similarity
0  10.1    NaN         NaN
1  10.2   10.3       0.900
2  10.4   10.3       0.998

Create a mask and use it to assign the values from patch:

mask = df['similarity'] > 0.99
df.loc[mask, 'RT'] = df['patch']

Result:

     RT  patch  similarity
0  10.1    NaN         NaN
1  10.2   10.3       0.900
2  10.3   10.3       0.998
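The answer's example uses a simplified column name, RT. As a minimal sketch, the same idea can be applied to the question's own column, 'RT (min)', using four rows copied from the question's table (one per group, keeping the original index labels). The question does not show its error message, so the cause is a guess: chained indexing such as df['RT (min)'][p] = ... may write to a temporary copy, whereas a single .loc call selects rows and column in one step.

```python
import pandas as pd

# Four rows copied from the question's table, keeping its index labels.
df = pd.DataFrame(
    {
        "RT (min)": [10.167, 10.333, 10.390, 10.516],
        "patch": [float("nan"), 10.167, 10.333, 10.390],
        "similarity": [float("nan"), 0.983935, 0.994514, 0.988018],
    },
    index=[8, 28, 48, 68],
)

# NaN > 0.99 evaluates to False, so rows without a similarity are left alone.
mask = df["similarity"] > 0.99

# One .loc call selects rows and column together, so the write lands in df.
df.loc[mask, "RT (min)"] = df.loc[mask, "patch"]

print(df)
```

Only the row labelled 48 satisfies the condition here, so only its RT (min) value changes, from 10.390 to 10.333.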

Answer 1 (score: 0)

Use pd.Series.mask.

You can assign the result like this:

df['RT'] = df['RT'].mask(df['similarity'] > 0.99, df['patch'])
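For comparison, here is a small sketch of Series.mask next to two equivalent spellings that are not in the original answer: Series.where with the condition inverted, and numpy.where. It reuses the three-row example DataFrame from the first answer. The variable names cond, via_mask, via_where and via_numpy are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "RT": [10.1, 10.2, 10.4],
    "patch": [float("nan"), 10.3, 10.3],
    "similarity": [float("nan"), 0.9, 0.998],
})

# Boolean Series; the NaN row compares as False.
cond = df["similarity"] > 0.99

# mask: replace values where cond is True.
via_mask = df["RT"].mask(cond, df["patch"])

# where: keep values where the condition is True, so invert it.
via_where = df["RT"].where(~cond, df["patch"])

# numpy.where returns a plain array, so wrap it to restore the index.
via_numpy = pd.Series(np.where(cond, df["patch"], df["RT"]), index=df.index, name="RT")

print(via_mask.tolist())
```

All three produce the same values. mask and where return a new Series, so the result still has to be assigned back to df['RT'], as in the answer.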