I have the following DataFrame in pandas:
code  diff   pv
   0   -34  100
   1    34  100
   2    16  100
   3   -50  150
The DataFrame I want is:
code  diff   pv    flag
   0  -344  100  excess
   1   344  100   short
   2     2  100    pass
   3    -5  150    pass
   4  -200  150  excess
   5   200  150   short
Logic for the flag column:
short = diff is positive and greater than pv
excess = diff is negative and less than -pv
pass = diff is within the range -pv to +pv
How can I do this in pandas?
Answer 0 (score: 2)
It is best to use numpy.select:
import numpy as np

# boolean masks for the two out-of-range cases
m1 = df['diff'] > df['pv']
m2 = df['diff'] < -df['pv']

# alternative masks if the sign of diff must also be checked explicitly
m1 = (df['diff'] > df['pv']) & (df['diff'] > 0)
m2 = (df['diff'] < -df['pv']) & (df['diff'] < 0)

# rows matching neither mask get the default 'pass'
df['flag'] = np.select([m1, m2], ['short', 'excess'], 'pass')

# equivalent solution with nested np.where
df['flag'] = np.where(m1, 'short',
                      np.where(m2, 'excess', 'pass'))
print(df)
   code  diff   pv    flag
0     0  -344  100  excess
1     1   344  100   short
2     2     2  100    pass
3     3    -5  150    pass
4     4  -200  150  excess
5     5   200  150   short
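For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the sample values shown in the desired output of the question; the mask names `short` and `excess` are my own and only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the desired output in the question.
df = pd.DataFrame({
    "code": [0, 1, 2, 3, 4, 5],
    "diff": [-344, 344, 2, -5, -200, 200],
    "pv": [100, 100, 100, 150, 150, 150],
})

# Boolean masks for the two out-of-range cases (names are illustrative).
short = df["diff"] > df["pv"]
excess = df["diff"] < -df["pv"]

# np.select takes the first matching condition; everything else is "pass".
df["flag"] = np.select([short, excess], ["short", "excess"], default="pass")
print(df)
```

Because the two masks cannot both be true when pv is non-negative, the order of the conditions does not matter here.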
Answer 1 (score: 0)
You can use the ratio df['diff'] / df['pv'] and map it with a dictionary:
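# clip the ratio to [-1, 1]: anything at or beyond the limits becomes exactly -1 or 1
ratio = df['diff'].div(df['pv']).clip(-1, 1)
# or ratio = np.minimum(1, np.maximum(-1, df['diff'] / df['pv']))

# -1 and 1 map to a flag; ratios strictly inside the range are unmapped (NaN) and become 'pass'
# note: rows where diff equals exactly +pv or -pv are also flagged 'short' / 'excess'
d = {-1: 'excess', 1: 'short'}
df['flag'] = ratio.map(d).fillna('pass')
print(df)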
   code  diff   pv    flag
0     0  -344  100  excess
1     1   344  100   short
2     2     2  100    pass
3     3    -5  150    pass
4     4  -200  150  excess
5     5   200  150   short
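One thing worth checking with the ratio approach is the boundary, where diff equals exactly +pv or -pv. The sketch below compares it against strict comparisons on a few hypothetical rows; the `edge` frame and the variable names are made up for illustration and do not come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical boundary rows (not from the question): diff equals +pv, -pv, or lies inside.
edge = pd.DataFrame({"diff": [100, -100, 50], "pv": [100, 100, 100]})

# Ratio approach: clipping turns every ratio at or beyond the limits into exactly -1 or 1.
ratio = edge["diff"].div(edge["pv"]).clip(-1, 1)
by_ratio = ratio.map({-1: "excess", 1: "short"}).fillna("pass")

# Strict comparisons: only values strictly outside [-pv, +pv] are flagged.
by_compare = np.select(
    [edge["diff"] > edge["pv"], edge["diff"] < -edge["pv"]],
    ["short", "excess"],
    default="pass",
)

print(pd.DataFrame({"by_ratio": by_ratio, "by_compare": by_compare}))
```

The two columns should disagree on the first two rows: the ratio version flags the boundary, the strict comparison does not. Which behaviour is wanted depends on how "within range of +- PV" is meant. The ratio version also assumes pv is positive.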
Answer 2 (score: 0)
This is what I would do:
def func(row):
    diff, pv = row['diff'], row['pv']
    if diff > 0 and diff > pv:
        return 'short'
    elif diff < 0 and diff < -pv:
        return 'excess'
    elif -pv <= diff <= pv:
        return 'pass'

df['flag'] = df.apply(func, axis=1)
Here func is applied to each row of df.
code  diff   pv    flag
   0  -344  100  excess
   1   344  100   short
   2     2  100    pass
   3    -5  150    pass
   4  -200  150  excess
   5   200  150   short
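Note that a row-wise function like this returns None when no branch matches, for example when diff is NaN. A minimal variant with an explicit fallback might look like the sketch below; the name `flag_row` and the choice of 'pass' as the fallback are assumptions of mine, not something the question specifies.

```python
import pandas as pd

def flag_row(row):
    """Classify one row as 'short', 'excess', or 'pass'."""
    diff, pv = row["diff"], row["pv"]
    if diff > 0 and diff > pv:
        return "short"
    if diff < 0 and diff < -pv:
        return "excess"
    # Assumed fallback: anything not flagged above (including NaN) counts as 'pass'.
    return "pass"

# Sample data copied from the desired output in the question.
df = pd.DataFrame({
    "code": [0, 1, 2, 3, 4, 5],
    "diff": [-344, 344, 2, -5, -200, 200],
    "pv": [100, 100, 100, 150, 150, 150],
})

# axis=1 passes each row to flag_row as a Series.
df["flag"] = df.apply(flag_row, axis=1)
print(df)
```

Row-wise apply calls a Python function once per row, so on large frames the vectorized mask-based approaches are usually faster.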