给出如下数据框:
id value1 value2
0 3918703 62.0 64.705882
1 3919144 60.0 60.000000
2 3919534 62.5 30.000000
3 3919559 55.0 55.000000
4 3920438 82.0 82.031250
5 3920463 71.0 71.428571
6 3920502 70.0 69.230769
7 3920535 80.0 40.000000
8 3920674 62.0 62.222222
9 3920856 80.0 79.987176
我要检查value2
是否在value1
的正负10%范围内,并返回新列result_review
。
如果它不在要求的范围内,则将No
表示为result_review
的值。
id value1 value2 results_review
0 3918703 62.0 64.705882 NaN
1 3919144 60.0 60.000000 NaN
2 3919534 62.5 30.000000 no
3 3919559 55.0 55.000000 NaN
4 3920438 82.0 82.031250 NaN
5 3920463 71.0 71.428571 NaN
6 3920502 70.0 69.230769 NaN
7 3920535 80.0 40.000000 no
8 3920674 62.0 62.222222 NaN
9 3920856 80.0 79.987176 NaN
如何在熊猫中做到这一点?感谢您的提前帮助。
答案 0 :(得分:2)
将Series.between
与DataFrame.loc
一起使用:
m = df['value2'].between(df['value1'].mul(0.9), df['value1'].mul(1.1))
df.loc[~m, 'results_review'] = 'no'
print (df)
id value1 value2 results_review
0 3918703 62.0 64.705882 NaN
1 3919144 60.0 60.000000 NaN
2 3919534 62.5 30.000000 no
3 3919559 55.0 55.000000 NaN
4 3920438 82.0 82.031250 NaN
5 3920463 71.0 71.428571 NaN
6 3920502 70.0 69.230769 NaN
7 3920535 80.0 40.000000 no
8 3920674 62.0 62.222222 NaN
9 3920856 80.0 79.987176 NaN