Statistics on streaks of consecutive increasing values in a DataFrame column

Time: 2014-01-04 16:06:21

Tags: python numpy pandas

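I am trying to compute statistics (min, max, avg, ...) on streaks of consecutive higher values in a column. I am fairly new to pandas and statistics; I searched around but could not find an answer.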

The data is financial data with OHLC values in the columns, for example:

              Open     High     Low     Close
Date                                          
2013-10-20  1.36825  1.38315  1.36502  1.38029
2013-10-27  1.38072  1.38167  1.34793  1.34858   
2013-11-03  1.34874  1.35466  1.32941  1.33664   
2013-11-10  1.33549  1.35045  1.33439  1.34950  
....

For example, the average length of streaks of consecutive higher lows.

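Later edit

I think I did not explain this clearly. An item that has already been counted in one streak cannot be counted again in another. So for the sequence: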

1,2,3,4,1,2,3,3,2,1

there are 4 streaks: 1,2,3,4 | 1,2,3,3 | 2 | 1

max = 4
min = 1
avg = (4+4+1+1)/4 = 2.5
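
The counting rule above can be sketched in plain Python. This is only a minimal sketch, assuming that a streak continues while the next value is greater than or equal to the previous one (which is what the 1,2,3,3 streak implies) and that a new streak starts at every drop. The helper name `streak_lengths` is hypothetical.

```python
def streak_lengths(values):
    """Lengths of maximal non-decreasing runs; every item belongs to exactly one run."""
    lengths = []
    current = 0
    previous = None
    for v in values:
        # A drop closes the current streak and starts a new one.
        if previous is not None and v < previous:
            lengths.append(current)
            current = 0
        current += 1
        previous = v
    if current:
        lengths.append(current)  # close the final streak
    return lengths


lengths = streak_lengths([1, 2, 3, 4, 1, 2, 3, 3, 2, 1])
print(lengths)                      # [4, 4, 1, 1]
print(max(lengths), min(lengths))   # 4 1
print(sum(lengths) / len(lengths))  # 2.5
```

If ties should end a streak instead (strictly higher values only), the comparison `v < previous` would become `v <= previous`.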

1 answer:

Answer 0: (score: 0)

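import pandas as pd
import numpy as np

s = pd.Series([1,2,3,4,1,2,3,3,2,1])

def ascends(s):
    # 1 where a value is >= its predecessor, padded with 0 at both ends
    diff = np.r_[0, (np.diff(s.values)>=0).astype(int), 0]
    # +1 marks the first element of a streak, -1 marks the last one
    diff2 = np.diff(diff)
    # elements with no non-decreasing step on either side: streaks of length 1
    descends = np.where(np.logical_not(diff)[1:] & np.logical_not(diff)[:-1])[0]
    starts = np.sort(np.r_[np.where(diff2 > 0)[0], descends])
    ends = np.sort(np.r_[np.where(diff2 < 0)[0], descends])
    # streak length = last index - first index + 1
    return ends - starts + 1

b = ascends(s)
print(b)
print(b.max())
print(b.min())
print(b.mean())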


Output:

[4 4 1 1]
4
1
2.5
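
As a possible alternative that stays within pandas, the same streak lengths can be obtained by labelling each streak with a running counter and grouping on it. This is a sketch rather than part of the answer above: the function name `streak_lengths_pandas` is hypothetical, the `lows` series simply reuses the four Low values from the question, and ties are assumed to continue a streak, as in the 1,2,3,3 example.

```python
import pandas as pd


def streak_lengths_pandas(s):
    """Sizes of maximal non-decreasing runs in a Series."""
    # True wherever a value drops below its predecessor, i.e. a new streak starts.
    # The first diff is NaN, and NaN < 0 is False, so the first item opens streak 0.
    new_streak = s.diff() < 0
    # The running count of drops gives every row the id of its streak.
    streak_id = new_streak.cumsum()
    return s.groupby(streak_id).size()


s = pd.Series([1, 2, 3, 4, 1, 2, 3, 3, 2, 1])
print(streak_lengths_pandas(s).tolist())  # [4, 4, 1, 1]

# Applied to the Low column of the sample rows shown in the question
lows = pd.Series(
    [1.36502, 1.34793, 1.32941, 1.33439],
    index=pd.to_datetime(["2013-10-20", "2013-10-27", "2013-11-03", "2013-11-10"]),
    name="Low",
)
low_streaks = streak_lengths_pandas(lows)
print(low_streaks.tolist())               # [1, 1, 2]
print(low_streaks.agg(["min", "max", "mean"]))
```

With a full OHLC DataFrame, the same function could be called on `df["Low"]`, assuming the frame is named `df` and sorted by date.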