I have a pandas DataFrame like this (output of df.head(10)):
7   RT (min)  Area (Ab*s)  Quality   patch  similarity
8     10.167     23278313       64     NaN         NaN
9     10.167     23278313       47     NaN         NaN
10    10.167     23278313       38     NaN         NaN
28    10.333      3407159       49  10.167    0.983935
29    10.333      3407159       22  10.167    0.983935
30    10.333      3407159       16  10.167    0.983935
48    10.390      3299202       38  10.333    0.994514
49    10.390      3299202       35  10.333    0.994514
50    10.390      3299202       32  10.333    0.994514
68    10.516      2015786       50  10.390    0.988018
I want to copy df['patch'] into df['RT (min)'] wherever df['similarity']>0.99.
For example, df should end up like this:
7   RT (min)  Area (Ab*s)  Quality   patch  similarity
8     10.167     23278313       64     NaN         NaN
9     10.167     23278313       47     NaN         NaN
10    10.167     23278313       38     NaN         NaN
28    10.333      3407159       49  10.167    0.983935
29    10.333      3407159       22  10.167    0.983935
30    10.333      3407159       16  10.167    0.983935
48    10.333      3299202       38  10.333    0.994514
49    10.333      3299202       35  10.333    0.994514
50    10.333      3299202       32  10.333    0.994514
68    10.516      2015786       50  10.390    0.988018
Rows 48, 49, and 50 of RT (min) are replaced by rows 48, 49, and 50 of patch.
I tried
p = df[df['similarity']>0.99].index.tolist()
df['RT (min)'][p] = df['patch'][p]
but I got an error message.
I don't know how to solve it.
Answer 0: (score: 1)
Something like this:
mask = df['similarity'] > 0.99
df.loc[mask, 'RT'] = df['patch']
For example:
import pandas as pd
df = pd.DataFrame({"RT":[10.1,10.2,10.4],"patch":[float("NaN"),10.3,10.3],"similarity":[float("NaN"),0.9,0.998]})
produces:
     RT  patch  similarity
0  10.1    NaN         NaN
1  10.2   10.3       0.900
2  10.4   10.3       0.998
Create the mask and use it to assign the values from patch:
mask = df['similarity'] > 0.99
df.loc[mask, 'RT'] = df['patch']
Result:
     RT  patch  similarity
0  10.1    NaN         NaN
1  10.2   10.3       0.900
2  10.3   10.3       0.998
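Applied to the column names from the question, the same idea might look like the sketch below. The four-row frame is only a cut-down copy of the posted data (one row per RT group, an assumption made to keep the example short), and the right-hand side is also restricted with the mask, which is optional because pandas aligns on the index anyway.

```python
import pandas as pd

# Cut-down copy of the question's data: one row per RT group (illustrative only).
df = pd.DataFrame(
    {
        "RT (min)": [10.167, 10.333, 10.390, 10.516],
        "patch": [float("nan"), 10.167, 10.333, 10.390],
        "similarity": [float("nan"), 0.983935, 0.994514, 0.988018],
    },
    index=[8, 28, 48, 68],
)

# NaN > 0.99 evaluates to False, so rows without a similarity stay untouched.
mask = df["similarity"] > 0.99

# One .loc call selects rows and column together, instead of the chained
# indexing (df[col][rows] = ...) attempted in the question.
df.loc[mask, "RT (min)"] = df.loc[mask, "patch"]

print(df)
```

Only the row labelled 48 passes the 0.99 threshold here, so only its RT (min) value is overwritten with the value from patch.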
Answer 1: (score: 0)