I aggregate my data by resampling, as follows:
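import quandl
import numpy as np
data = quandl.get("WIKI/KO", trim_start = "2000-12-12", trim_end = "2014-12-30")
# Keep the first 40 rows of the 'Close' column (.ix has been removed from pandas)
data = data[['Close']].iloc[:40]
# Random signal in {0, 1, 2}; remap 2 to -1 so the values are -1, 0 or 1
data['SIGNAL'] = np.random.randint(0,3, size=len(data))
data['SIGNAL'] = np.where((data['SIGNAL'] == 2), -1, data['SIGNAL'] )
# Force the signal to 0 from 2001-02-01 onwards, so February has no matches
data['SIGNAL'] = np.where((data.index >= '2001-02-01'), 0, data['SIGNAL'] )
data['WIN'] = 10
print(data.to_string())
# Monthly sum and count of WIN over the rows with a non-zero signal
WinPerYear = data['WIN'].loc[(data['SIGNAL'] != 0)].resample('M').sum()
CntPerYear = data['WIN'].loc[(data['SIGNAL'] != 0)].resample('M').count()
print(WinPerYear.to_string())
print(CntPerYear.to_string())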
This correctly produces the following result:
Close SIGNAL WIN
Date
[...]
2001-01-17 57.94 -1 10
2001-01-18 57.13 -1 10
2001-01-19 55.81 -1 10
2001-01-22 55.69 -1 10
2001-01-23 56.88 0 10
2001-01-24 58.06 -1 10
2001-01-25 58.63 -1 10
2001-01-26 57.94 -1 10
2001-01-29 57.12 0 10
2001-01-30 57.91 0 10
2001-01-31 58.00 1 10
2001-02-01 57.44 0 10
2001-02-02 57.74 0 10
2001-02-05 59.20 0 10
2001-02-06 59.42 0 10
2001-02-07 60.00 0 10
2001-02-08 60.61 0 10
Date
2000-12-31 60
2001-01-31 160
Freq: M
Date
2000-12-31 6
2001-01-31 16
Freq: M
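Is there a simple way, i.e. without changing all of the subsetting logic, to add rows for every month that has no matches? For example, 2001-02 has no matches, so I would like both aggregations to contain a 0 for it, like this: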
Date
2000-12-31 60
2001-01-31 160
2001-02-28 0
Freq: M
Date
2000-12-31 6
2001-01-31 16
2001-02-28 0
Freq: M
Many thanks and best regards
Answer 0 (score: 0)
I solved this by adding a new calculated field:
data['WIN_SIG'] = 0
data.loc[data['SIGNAL'] != 0, 'WIN_SIG'] = data['WIN']
which I drop again right after aggregating.
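For reference, here is a minimal, self-contained sketch of that helper-column idea. It makes some assumptions that are not in the original post: the Quandl download is replaced by synthetic business-day data, a second helper column named CNT_SIG (a hypothetical name) is added so that the count also comes out as 0 for unmatched months, and a `pd.offsets.MonthEnd()` offset is passed to `resample` in place of the `'M'` alias. Because every row is kept and only the non-matching values are zeroed, months with no signal still appear in the result.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Quandl data (hypothetical values):
# business days from mid-December 2000 to early February 2001.
idx = pd.bdate_range("2000-12-12", "2001-02-08", name="Date")
data = pd.DataFrame({"WIN": 10}, index=idx)

# Random signal in {-1, 0, 1}, forced to 0 from 2001-02-01 onwards.
rng = np.random.default_rng(0)
data["SIGNAL"] = rng.choice([-1, 0, 1], size=len(data))
data.loc[data.index >= "2001-02-01", "SIGNAL"] = 0

# Helper columns: WIN where the signal is non-zero (else 0), and a 0/1 flag.
matched = data["SIGNAL"] != 0
data["WIN_SIG"] = np.where(matched, data["WIN"], 0)
data["CNT_SIG"] = matched.astype(int)

# Resample ALL rows, so months without any match are still present.
month_end = pd.offsets.MonthEnd()
win_per_month = data["WIN_SIG"].resample(month_end).sum()
cnt_per_month = data["CNT_SIG"].resample(month_end).sum()

# Drop the helper columns once the aggregation is done.
data = data.drop(columns=["WIN_SIG", "CNT_SIG"])

print(win_per_month.to_string())
print(cnt_per_month.to_string())
```

Note that calling `.count()` on WIN_SIG would count every row in the month, not only the matched ones, which is why this sketch sums a 0/1 flag for the count instead.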
Thanks and cheers