How do I create a pandas DataFrame based on start and end signals from other DataFrames?

Asked: 2014-01-25 02:02:02

Tags: python pandas dataframe

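So, essentially, how do I take A (entry signals) and B (exit signals) and produce C (trade signals)?

Note that A is the entry signal and B is the exit signal. So: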

A    B    C
0    1    0 (no entry signal yet, so 0)
1    0    1 (got entry signal and haven't gotten exit signal yet, so 1)
0    0    1 (got entry signal and haven't gotten exit signal yet, so 1)
0    1    1 (got exit signal while in position, so C is still 1 here and changes from 1 to 0 on the next row)
0    0    0 (nothing, so stay at 0)
1    0    1 (got entry signal and haven't gotten exit signal yet, so 1)
0    1    1 (got exit signal while in position, so C is still 1 here and changes from 1 to 0 on the next row)
0    1    0 (got exit signal, but already at 0, so do nothing)
1    0    1 (got entry signal and haven't gotten exit signal yet, so 1)
0    0    1 (nothing, so stay at 1)

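So basically, a 1 in A is a signal to switch C "on", and a 1 in B is a signal to switch C "off". If C is already on (the previous element of C is 1), a 1 in A does nothing, because C is already on. Similarly, if C is already off (the previous element of C is 0), a 1 in B does nothing, because C is already off. In other words, A holds a list of signals to enter a trade and B holds a list of signals to exit, but you can only enter when you are not in a trade and you can only exit when you are in one, so C is the record of whether you are in a trade.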
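The rules above can be written as a small state machine, which is handy as a reference when checking a vectorized version. This is only a sketch: the function name `trade_signal` is made up for illustration, and it assumes (from the example table) that C stays at 1 on the row where the exit signal arrives and drops to 0 on the following row.

```python
def trade_signal(entries, exits):
    """Return the trade signal C for entry flags A and exit flags B (0/1 sequences)."""
    in_position = False
    result = []
    for entry, exit_ in zip(entries, exits):
        # An entry signal only matters when we are not already in a trade.
        if not in_position and entry:
            in_position = True
        result.append(1 if in_position else 0)
        # An exit signal only matters when we are in a trade; it takes
        # effect from the next row on (assumption read off the example table).
        if in_position and exit_:
            in_position = False
    return result


# Columns A and B from the example table.
A = [0, 1, 0, 0, 0, 1, 0, 0, 1, 0]
B = [1, 0, 0, 1, 0, 0, 1, 1, 0, 0]
print(trade_signal(A, B))  # [0, 1, 1, 1, 0, 1, 1, 0, 1, 1]
```

The loop is slow on long series, but it states the intended behavior unambiguously, so any faster approach can be compared against it on sample data.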

I tried to implement your solution as follows:

def generate_signals(self):
      signals = pandas.DataFrame(index=self.data.index)
      signals['Date'] = self.data['Date']
      signals['Close'] = self.data['Close']
      signals['fast_MA'] = pandas.stats.moments.ewma(self.data['Close'],
                                                     span=self.short_window)
      signals['slow_MA'] = pandas.stats.moments.ewma(self.data['Close'],
                                                     span=self.long_window)
      signalinfo = pandas.DataFrame(index=signals.index)
      signalinfo['entry_signals'] = numpy.where(signals['fast_MA'] >
                                                signals['slow_MA'], 1.0, 0.0)
      signals['stop'] = (data['Open'].shift(-1)
                         [(signalinfo['entry_signals'] == 1.0)
                          & (signalinfo['entry_signals']
                             .shift(1) == 0.0)])
      signals['stop'] = signals['stop'].fillna(0.0)
      signals['stop'] = signals['stop'].apply(lambda x: .97 * x)
      signals['stop'] = (signals['stop']
                         .replace(to_replace=0.0, method='ffill'))
      signalinfo['exit_signals'] = numpy.where(signals['Close'] <=
                                               signals['stop'], 1.0, 0.0)

      #process entry and exit signals to form trade signals
      signalinfo['Close'] = self.data['Close']
      mask = signalinfo.copy().astype(bool)
      signalinfo.entry_signals[mask.entry_signals] = signalinfo.index
      signalinfo.exit_signals[mask.exit_signals] = signalinfo.index
      signalinfo = signalinfo[mask].ffill().fillna(0)
      signalinfo['signal'] = (signalinfo['exit_signals']
                              < signalinfo['entry_signals']).astype(int)
      signalinfo['entry_signals'] = numpy.where(signals['fast_MA'] >
                                                signals['slow_MA'], 1.0, 0.0)
      signalinfo['exit_signals'] = numpy.where(signals['Close'] <=
                                               signals['stop'], 1.0, 0.0)
      signals['signal'][:self.long_window] = 0.0
      print signalinfo.head(50)
      print signalinfo.tail(50)
      return signals

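This produces the following result:

A    B    C
0    1    0 (no entry signal yet, so 0)
1    0    1 (got entry signal and haven't gotten exit signal yet, so 1)
0    0    1 (got entry signal and haven't gotten exit signal yet, so 1)
0    1    0 (should be 1: got exit signal and currently in position, so change from 1 to 0)
0    0    0 (nothing, so stay at 0)
1    0    1 (got entry signal and haven't gotten exit signal yet, so 1)
0    1    0 (should be 1: got exit signal and currently in position, so change from 1 to 0)
0    1    0 (got exit signal, but already at 0, so do nothing)
1    0    1 (got entry signal and haven't gotten exit signal yet, so 1)
0    0    1 (nothing, so stay at 1)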

Any ideas on how to fix this? I am trying to use the ffill approach from your solution, but I get a ValueError.

1 Answer:

Answer 0 (score: 0)

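I have not tested the following code extensively, but it produces what you want for the example. Could you test it with more data?

The basic idea is to replace each 1 in A and B with its index value and forward-fill. We can then compare the two columns: whichever holds the larger index is the signal that is currently active: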

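import pandas as pd

# signal.csv holds the whitespace-separated A and B columns from the question
df = pd.read_csv("signal.csv", sep=r"\s+")

mask = df.astype(bool)
# replace every 1 with the index of its row (.loc avoids chained assignment)
df.loc[mask.A, "A"] = df.index[mask.A]
df.loc[mask.B, "B"] = df.index[mask.B]
# blank out the remaining 0s, then carry the latest signal index forward
df = df[mask].ffill()
# in a trade while the latest entry is more recent than the latest exit
C = df.B < df.A
# keep the exit row itself at 1, then set everything else to 0
C.astype(float).where(C).ffill(limit=1).fillna(0)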

Output:

0    0
1    1
2    1
3    1
4    0
5    1
6    1
7    0
8    1
9    1
dtype: float64
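For anyone who wants to try the idea without a CSV file, here is a self-contained sketch on the example data. It is a variation, not the exact code above: the names `pos`, `last_entry`, `last_exit`, `held` and `signal` are illustrative, the one-row extension uses `shift` instead of `where(...).ffill(limit=1)`, and the `fillna(-1)` is an addition so that an entry arriving before any exit signal still counts. It also assumes A and B never fire on the same row.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [0, 1, 0, 0, 0, 1, 0, 0, 1, 0],
    "B": [1, 0, 0, 1, 0, 0, 1, 1, 0, 0],
})

# Row positions as floats, so rows without a signal can become NaN.
pos = pd.Series(range(len(df)), index=df.index, dtype="float64")

# Position of the most recent entry / exit signal seen so far.
last_entry = pos.where(df["A"] == 1).ffill()
# -1 stands for "no exit seen yet" (an addition, not in the code above).
last_exit = pos.where(df["B"] == 1).ffill().fillna(-1)

# In a trade whenever the latest entry is more recent than the latest exit.
held = last_exit < last_entry

# Keep the exit row itself at 1 by extending each run of True by one row.
signal = (held | held.shift(1, fill_value=False)).astype(int)

print(held.astype(int).tolist())  # [0, 1, 1, 0, 0, 1, 0, 0, 1, 1]
print(signal.tolist())            # [0, 1, 1, 1, 0, 1, 1, 0, 1, 1]
```

The comparison on its own (`held`) gives 0 on the exit rows, which is the result shown in the question; the attempt there stops at `(exit_signals < entry_signals).astype(int)`. The one-row extension is the step that turns those rows back into 1.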