I have the following dataframe in pandas
code time_diff diff_flag quantity
123 0 zero 0.45
124 5 less than 6 0.80
125 8 no issue 0.78
126 18 no issue 2.78
127 28 no issue 4.78
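I want to subtract 6 from time_diff in every row, except where diff_flag is "zero" or "less than 6"; those rows should read "Data Error". My desired dataframe is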
code time_diff diff_flag quantity new_diff
123 0 zero 0.45 Data Error
124 5 less than 6 0.80 Data Error
125 8 no issue 0.78 2
126 18 no issue 2.78 12
127 28 no issue 4.78 22
How can I do this in pandas?
Answer 0: (score: 1)
Use numpy.where:
import numpy as np
# Flag the rows that should not get a numeric result
m = df['diff_flag'].isin(['zero','less than 6'])
df['new_diff'] = np.where(m, 'Data Error', df['time_diff'] - 6)
Or:
m1 = df['time_diff'] == 0
m2 = df['time_diff'] < 6
df['new_diff'] = np.where(m1 | m2, 'Data Error', df['time_diff'] - 6)
Or:
m = df['diff_flag'] == 'no issue'
df['new_diff'] = np.where(m, df['time_diff'] - 6, 'Data Error')
print(df)
code time_diff diff_flag quantity new_diff
0 123 0 zero 0.45 Data Error
1 124 5 less than 6 0.80 Data Error
2 125 8 no issue 0.78 2
3 126 18 no issue 2.78 12
4 127 28 no issue 4.78 22
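One caveat worth checking: because numpy.where has to choose a single dtype for its result, mixing a string with integers appears to turn the numbers into strings as well, even though the printed frame looks the same. The sketch below rebuilds the sample frame from the question and compares that with Series.mask, which is swapped in here as an alternative and is not part of the answer above. The name as_text is only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    'code': [123, 124, 125, 126, 127],
    'time_diff': [0, 5, 8, 18, 28],
    'diff_flag': ['zero', 'less than 6', 'no issue', 'no issue', 'no issue'],
    'quantity': [0.45, 0.80, 0.78, 2.78, 4.78],
})

# Rows that should be reported as errors instead of a number.
m = df['diff_flag'].isin(['zero', 'less than 6'])

# np.where picks one dtype for the whole result, so combining a string
# with integers yields strings throughout: '2', not 2.
as_text = np.where(m, 'Data Error', df['time_diff'] - 6)

# Series.mask replaces values where the condition is True and upcasts the
# column to object dtype, so the untouched values stay numeric.
df['new_diff'] = (df['time_diff'] - 6).mask(m, 'Data Error')
```

If new_diff is only ever displayed, either form works; the mask version matters when the numeric rows are used in later arithmetic.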
Answer 1: (score: 1)
Why not this?
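# clip(lower='Data Error') fails because a string bound cannot be compared with integers, so Series.where is used here instead
df['new_diff'] = (df['time_diff'] - 6).where(df['time_diff'] >= 6, 'Data Error')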