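I am trying to compute statistics (min, max, avg, ...) on streaks of consecutive higher values in a column. I am fairly new to pandas and to statistics; I searched but could not find an answer.
The data is financial, with OHLC values in the columns, e.g.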
               Open     High      Low    Close
Date
2013-10-20  1.36825  1.38315  1.36502  1.38029
2013-10-27  1.38072  1.38167  1.34793  1.34858
2013-11-03  1.34874  1.35466  1.32941  1.33664
2013-11-10  1.33549  1.35045  1.33439  1.34950
....
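For example, the average length of a streak of consecutive higher lows.
Later edit
I don't think I explained this clearly. An item that has already been counted in one streak cannot be counted again. So for the sequence: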
1,2,3,4,1,2,3,3,2,1
there are 4 streaks: 1,2,3,4 | 1,2,3,3 | 2 | 1
max = 4
min = 1
avg = (4+4+1+1)/4 = 2.5
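To make the counting rule concrete, here is a minimal plain-Python sketch of the split described above. It assumes that an equal value continues a streak (as in 1,2,3,3) and that any drop starts a new one; the name `split_streaks` is only illustrative.

```python
def split_streaks(values):
    """Split values into maximal non-decreasing runs.

    Each item lands in exactly one run, so nothing is counted twice.
    """
    streaks = []
    for v in values:
        if streaks and v >= streaks[-1][-1]:
            # Value is not lower than the previous one: extend the current streak.
            streaks[-1].append(v)
        else:
            # First value, or a drop: start a new streak.
            streaks.append([v])
    return streaks


streaks = split_streaks([1, 2, 3, 4, 1, 2, 3, 3, 2, 1])
lengths = [len(run) for run in streaks]
print(streaks)  # [[1, 2, 3, 4], [1, 2, 3, 3], [2], [1]]
print(max(lengths), min(lengths), sum(lengths) / len(lengths))  # 4 1 2.5
```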
Answer 0 (score: 0)
import pandas as pd
import numpy as np

s = pd.Series([1,2,3,4,1,2,3,3,2,1])

def ascends(s):
    # 1 where a value is >= its predecessor, padded with 0 at both ends
    diff = np.r_[0, (np.diff(s.values)>=0).astype(int), 0]
    # +1 marks the first item of a rising run, -1 marks its last item
    diff2 = np.diff(diff)
    # items that neither continue nor start a rising run: streaks of length 1
    descends = np.where(np.logical_not(diff)[1:] & np.logical_not(diff)[:-1])[0]
    starts = np.sort(np.r_[np.where(diff2 > 0)[0], descends])
    ends = np.sort(np.r_[np.where(diff2 < 0)[0], descends])
    return ends - starts + 1

b = ascends(s)
print(b)
print(b.max())
print(b.min())
print(b.mean())
Output:
[4 4 1 1]
4
1
2.5
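As an alternative to the NumPy approach, the same lengths can probably be obtained with a pandas `diff`/`cumsum`/`groupby` pattern. The sketch below is not part of the original answer: the helper name `streak_lengths` is hypothetical, and it assumes (as in the question's example) that an equal value continues a streak. It is also applied to the four `Low` values shown in the question.

```python
import pandas as pd


def streak_lengths(s):
    """Lengths of maximal non-decreasing runs in a Series (illustrative helper)."""
    # A new streak starts wherever the value drops below the previous one.
    # The first diff is NaN, which compares as False, so the first item gets id 0.
    streak_id = s.diff().lt(0).cumsum()
    return s.groupby(streak_id).size()


s = pd.Series([1, 2, 3, 4, 1, 2, 3, 3, 2, 1])
lengths = streak_lengths(s)
print(lengths.tolist())
print(lengths.agg(["min", "max", "mean"]))

# The 'Low' column from the sample rows in the question.
df = pd.DataFrame(
    {"Low": [1.36502, 1.34793, 1.32941, 1.33439]},
    index=pd.to_datetime(["2013-10-20", "2013-10-27", "2013-11-03", "2013-11-10"]),
)
low_lengths = streak_lengths(df["Low"])
print(low_lengths.tolist())
```

If only strictly higher values should extend a streak, replacing `.lt(0)` with `.le(0)` would start a new streak on ties as well.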