Summing rows over consecutive days

Time: 2013-09-11 03:37:40

Tags: python pandas

I need to do some calculations based on the sum of a measure over runs of consecutive days. For example:

import pandas as pd
from numpy.random import randn
from pandas import Series
rng = pd.date_range('1/3/2000', periods=8)
rng = rng[:4].append(rng[5:])  # drop 2000-01-07 to create a gap
ts = Series(randn(7).astype('int'), index=rng)
ts

Out[1]:
2000-01-03    0
2000-01-04    0
2000-01-05    0
2000-01-06   -1
2000-01-08    0
2000-01-09   -2
2000-01-10   -1
dtype: int64

How can I sum the values over each run of consecutive days, so that I get something like this?

Out[2]:
2000-01-03   -1
2000-01-04   -1
2000-01-05   -1
2000-01-06   -1
2000-01-08   -3
2000-01-09   -3
2000-01-10   -3
dtype: int64

[Edit] Similar problem solved in R

1 answer:

Answer 0 (score: 1)

Now that I have found the answer, the problem seems simpler:

from numpy import timedelta64
from pandas import Series

def ranks(series):
    """
    In an ORDERED series, identify runs of consecutive days and give
    each run a unique integer identifier. The argument must be a
    pandas Series with a datetime index.
    """
    td = series.index.to_series().diff()
    # The first difference is NaT; treat the first row as a one-day step.
    td.iloc[0] = timedelta64(1, 'D')
    res = []
    counter = 0
    for i in range(td.size):
        # A gap of more than one day starts a new group.
        if td.iloc[i] > timedelta64(1, 'D'):
            counter += 1
        res.append(counter)
    return Series(res, index=series.index)
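The same labelling can probably be written without an explicit loop. The following is only a sketch, assuming a sorted datetime index; `consecutive_groups` is a hypothetical name that does not appear in the original answer, and the sample values are copied from Out[1] in the question rather than generated randomly.

```python
import pandas as pd

def consecutive_groups(series):
    """Label each run of consecutive days 0, 1, 2, ... (hypothetical helper)."""
    # True wherever the step from the previous row exceeds one day.
    # The first difference is NaT, which compares as False.
    gaps = series.index.to_series().diff() > pd.Timedelta(days=1)
    # A running count of gaps gives one integer label per run.
    return gaps.cumsum()

rng = pd.date_range('1/3/2000', periods=8)
rng = rng[:4].append(rng[5:])  # drop 2000-01-07 to create a gap
ts = pd.Series([0, 0, 0, -1, 0, -2, -1], index=rng)
labels = consecutive_groups(ts)
```

Here `labels` should hold 0 for the four days before the gap and 1 for the three days after it, matching what `ranks` returns for the same index.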

From here, pandas groupby takes care of the rest. ;-)

from pandas import DataFrame

df = DataFrame({'val': ts, 'gr': ranks(ts)})
gr = df.groupby('gr')
# Both frames have a 'val' column, so the merge yields val_x (original) and val_y (group sum).
df.merge(gr.sum(), left_on='gr', right_index=True, how='outer')
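If only the per-row group totals are needed, as in Out[2] of the question, `groupby(...).transform('sum')` may be a shorter route than merging. This is a sketch rather than the answer's own method: the values are copied from Out[1], and the group labels are written out by hand as an assumption about what `ranks(ts)` yields for this index.

```python
import pandas as pd

rng = pd.date_range('1/3/2000', periods=8)
rng = rng[:4].append(rng[5:])  # drop 2000-01-07 to create a gap
ts = pd.Series([0, 0, 0, -1, 0, -2, -1], index=rng)

# Assumed group labels: one run before the gap, one after it.
groups = pd.Series([0, 0, 0, 0, 1, 1, 1], index=ts.index)

# transform('sum') broadcasts each group's total back onto its rows,
# so the result keeps the original datetime index.
summed = ts.groupby(groups).transform('sum')
```

The result is a Series on the same index as `ts`, so no extra `gr`, `val_x` or `val_y` columns have to be cleaned up afterwards.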
