Question

Suppose I have the following DataFrame:

  date       A     B    M     S
 20150101    8     7    7.5   0
 20150101    10    9    9.5   -1
 20150102    9     8    8.5   1
 20150103    11    11   11    0
 20150104    11    10   10.5  0
 20150105    12    10   11    -1
 ...

I want to create another column, 'cost', according to these rules:

If S < 0, cost = (M - B).shift(1) * S
If S > 0, cost = (M - A).shift(1) * S
If S == 0, cost = 0

Currently, I am using the following function:

def cost(df):
    if df[3]<0:
        return np.roll((df[2]-df[1]),1)*df[3]
    elif df[3]>0:
        return np.roll((df[2]-df[0]),1)*df[3]
    else:
        return 0
df['cost']=df.apply(cost,axis=0)

Is there another way to do this? Can I somehow use the pandas shift function inside a user-defined function? Thanks.

Answer 1

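Doing it this way is usually expensive, because applying a user-defined function loses the speed advantage of vectorization. Instead, how about using the NumPy version of the ternary operator, np.where: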

import numpy as np

# Column labels 0-3 follow the question's code: 0=A, 1=B, 2=M, 3=S.
np.where(df[3] < 0,
    np.roll((df[2]-df[1]),1)*df[3],
    np.where(df[3] > 0,
        np.roll((df[2]-df[0]),1)*df[3],
        0))

(And of course assign the result to df['cost'].)
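For reference, here is a minimal sketch of the same idea written with the pandas shift method that the question asks about. It assumes the columns are actually named date, A, B, M and S as in the sample table (the code above uses integer labels instead), and it swaps the nested np.where for np.select, which is just one way to express the three cases:

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; the named columns are an assumption.
df = pd.DataFrame({
    "date": [20150101, 20150101, 20150102, 20150103, 20150104, 20150105],
    "A": [8, 10, 9, 11, 11, 12],
    "B": [7, 9, 8, 11, 10, 10],
    "M": [7.5, 9.5, 8.5, 11.0, 10.5, 11.0],
    "S": [0, -1, 1, 0, 0, -1],
})

# Previous row's differences; the first row has no predecessor, so it is NaN.
prev_m_minus_b = (df["M"] - df["B"]).shift(1)
prev_m_minus_a = (df["M"] - df["A"]).shift(1)

# Pick the matching rule per row; rows with S == 0 fall through to the default.
df["cost"] = np.select(
    [df["S"] < 0, df["S"] > 0],
    [prev_m_minus_b * df["S"], prev_m_minus_a * df["S"]],
    default=0.0,
)
```

One difference worth keeping in mind: np.roll wraps around, so the first row is paired with the last row's value, whereas shift(1) leaves NaN in the first row. If the first row has a nonzero S, the two versions will give different results there.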

Roll/shift call function in a pandas DataFrame
