A Pythonic way to create a recursive series of 1s and 0s based on the levels of a pandas time series

Time: 2016-10-23 17:09:24

Tags: python python-2.7 pandas

I am trying to clean up some code that works on this DataFrame:

import pandas as pd

df = pd.DataFrame({'value': {'2016-09-21': 13.30,
  '2016-09-22': 12.02,
  '2016-09-23': 12.28,
  '2016-09-26': 14.5,
  '2016-09-27': 13.1,
  '2016-09-28': 12.39,
  '2016-09-29': 14.02}})

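I have an ON/OFF signal driven by two levels. When 'value' crosses above 14.39 I want the signal to be 1, and to stay 1 until 'value' crosses below 12.50, like this: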

df
             value  sig
2016-09-21 13.3000    0
2016-09-22 12.0200    0
2016-09-23 12.2800    0
2016-09-26 14.5000    1
2016-09-27 13.1000    1
2016-09-28 12.3900    0
2016-09-29 14.0200    0

I am currently solving this with a loop, but I am fairly sure there is a better way. Here is my approach:

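# NOTE: `off` holds the upper (switch-on) level, `on` the lower (switch-off) level
off, on, sig = 14.39, 12.50, 0
log = []
for level in df.itertuples():
    if level.value > off:                  # above the upper level: switch on
        sig = 1
    elif sig == 1 and level.value < on:    # below the lower level: switch off
        sig = 0
    log.append([level.value, sig])
log = pd.DataFrame(log, index=df.index, columns=['value', 'sig'])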

3 answers:

Answer 0 (score: 1)

Here is a vectorized solution using the pandas.Series.where method:

import numpy as np

ON, OFF = 14.39, 12.50
df['sig'] = 0                                 # start with every row set to 0
df['sig'] = (df.sig.where(df.value < ON, 1)   # value >= ON: set to 1
                   .where((df.value < OFF) | (df.value > ON), np.nan)
                                              # OFF <= value <= ON: set to NaN
                   .ffill().fillna(0)         # NaN rows inherit the previous state;
                                              # rows before the first crossing become 0
                   .astype(int))              # ffill/fillna leave floats; cast back to int
df

#           value   sig
#2016-09-21 13.30     0
#2016-09-22 12.02     0
#2016-09-23 12.28     0
#2016-09-26 14.50     1
#2016-09-27 13.10     1
#2016-09-28 12.39     0
#2016-09-29 14.02     0

A similar approach with np.where() may be clearer:

import numpy as np
df['sig'] = np.where(df.value > ON, 1, np.where(df.value < OFF, 0, np.nan))
df['sig'] = df.sig.ffill().fillna(0).astype(int)  # cast back to int after filling
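
The np.where() idea above can be wrapped in a small reusable function. This is only a sketch: the name hysteresis_signal and its parameters on_level, off_level and initial are made up for illustration and are not part of pandas or NumPy. It assumes on_level is greater than off_level and that rows before the first crossing should take the value of initial.

```python
import numpy as np
import pandas as pd


def hysteresis_signal(values, on_level, off_level, initial=0):
    """Hypothetical helper: 1 after a rise above on_level, 0 after a drop below off_level."""
    # 1 above the upper level, 0 below the lower level, NaN inside the band
    raw = np.where(values > on_level, 1.0,
                   np.where(values < off_level, 0.0, np.nan))
    state = pd.Series(raw, index=values.index)
    # Rows inside the band inherit the previous state via forward fill;
    # rows before the first crossing fall back to `initial`.
    return state.ffill().fillna(initial).astype(int)


df = pd.DataFrame({'value': {'2016-09-21': 13.30,
                             '2016-09-22': 12.02,
                             '2016-09-23': 12.28,
                             '2016-09-26': 14.5,
                             '2016-09-27': 13.1,
                             '2016-09-28': 12.39,
                             '2016-09-29': 14.02}})
df['sig'] = hysteresis_signal(df['value'], on_level=14.39, off_level=12.50)
print(df)
```

Passing initial=1 would instead treat the series as already switched on before the first row, which only changes rows that precede the first crossing of either level.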

Answer 1 (score: 0)

Try this:

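# Parentheses are required because & binds tighter than < and >.
# This flags rows strictly between `on` and `off` (names as in the question);
# no state is carried from one row to the next.
df['sig'] = ((df['value'] < off) & (df['value'] > on)).astype(int)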

Answer 2 (score: 0)

Off the top of my head and untested.

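import numpy as np

v = df['value']
# +1 above the upper level, -1 below the lower level, 0 inside the band
s = v.gt(14.39).astype(int).sub(v.lt(12.5).astype(int))
# inside the band: inherit the previous state; map -1/+1 to 0/1; leading NaN becomes 0
df['sig'] = s.where(s.ne(0), np.nan).ffill().add(1).div(2).fillna(0).astype(int)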
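
Since this approach is described as untested, one way to gain some confidence in it is to compare it against the loop from the question on random data. The following is a sketch under stated assumptions: the helper names loop_signal and sign_signal, the constants HI and LO, the value range and the random seed are all invented for this check, and the sign-based version includes the integer casts and the final fillna(0) needed to make it run and to match the loop's initial state of 0.

```python
import numpy as np
import pandas as pd

HI, LO = 14.39, 12.50  # levels taken from the question


def loop_signal(values, hi=HI, lo=LO):
    """Reference implementation: the explicit loop from the question."""
    sig, out = 0, []
    for v in values:
        if v > hi:
            sig = 1
        elif sig == 1 and v < lo:
            sig = 0
        out.append(sig)
    return pd.Series(out, index=values.index)


def sign_signal(values, hi=HI, lo=LO):
    """Sign-based variant: +1 above hi, -1 below lo, forward-filled in between."""
    s = values.gt(hi).astype(int).sub(values.lt(lo).astype(int))
    filled = s.where(s.ne(0), np.nan).ffill()
    # Map -1/+1 to 0/1; rows before the first crossing start at 0
    return filled.add(1).div(2).fillna(0).astype(int)


# Hypothetical test data: 1000 uniform draws around the two levels
rng = np.random.default_rng(0)
random_values = pd.Series(rng.uniform(11.0, 16.0, size=1000))
matches = bool((loop_signal(random_values) == sign_signal(random_values)).all())
print(matches)
```

A check like this does not prove the two versions are equivalent, but a mismatch on random input would expose a logic error quickly.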