Sampling x days at a time from a pandas DataFrame with multiple entries per day

Time: 2017-08-07 07:09:18

Tags: python pandas

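My DataFrame has multiple timestamped entries per day. I want to sample x days at a time (for example, 2 days) and step forward one day at a time until the end of the date range. How can I do this?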

For example, given multiple entries per day:

 datetime             value
 2015-12-02 12:02:35    1
 2015-12-02 12:02:44    2
 2015-12-03 12:39:05    4
 2015-12-03 12:39:12    7
 2015-12-04 14:27:41    2
 2015-12-04 14:27:45    8
 2015-12-07 09:52:58    3
 2015-12-07 13:52:15    5
 2015-12-07 13:52:21    9

I want to iterate over samples of two days at a time, for example:

 2015-12-02 12:02:35    1
 2015-12-02 12:02:44    2
 2015-12-03 12:39:05    4
 2015-12-03 12:39:12    7

then

 2015-12-03 12:39:05    4
 2015-12-03 12:39:12    7
 2015-12-04 14:27:41    2
 2015-12-04 14:27:45    8

and finally
 2015-12-04 14:27:41    2
 2015-12-04 14:27:45    8
 2015-12-07 09:52:58    3
 2015-12-07 13:52:15    5
 2015-12-07 13:52:21    9

Any help would be much appreciated!

1 answer:

Answer 0: (score: 1)

You can use:

# https://stackoverflow.com/a/6822773/2901002
from itertools import islice

def window(seq, n=2):
    """Return a sliding window (of width n) over data from the iterable.

    s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...
    """
    it = iter(seq)
    # Fill the first window with the first n items
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    # Slide by one: drop the oldest item, append the next
    for elem in it:
        result = result[1:] + (elem,)
        yield result


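# Window over calendar dates rather than dt.day, so that the same day number
# in different months (e.g. 2015-12-02 and 2016-01-02) is not conflated
days = df['datetime'].dt.normalize()
dfs = [df[days.isin(x)] for x in window(days.unique())]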
print (dfs[0])
             datetime  value
0 2015-12-02 12:02:35      1
1 2015-12-02 12:02:44      2
2 2015-12-03 12:39:05      4
3 2015-12-03 12:39:12      7

print (dfs[1])
             datetime  value
2 2015-12-03 12:39:05      4
3 2015-12-03 12:39:12      7
4 2015-12-04 14:27:41      2
5 2015-12-04 14:27:45      8
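
For reference, here is a self-contained sketch of the same idea that can be run as is. It rebuilds the sample data from the question and generalises the window width to any number of days. The helper name `iter_day_windows` and its parameters `n_days` and `column` are made up for this illustration and are not part of pandas. It assumes that "consecutive days" means consecutive dates that actually occur in the data, which is why 2015-12-04 is paired with 2015-12-07 as in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2015-12-02 12:02:35", "2015-12-02 12:02:44",
        "2015-12-03 12:39:05", "2015-12-03 12:39:12",
        "2015-12-04 14:27:41", "2015-12-04 14:27:45",
        "2015-12-07 09:52:58", "2015-12-07 13:52:15",
        "2015-12-07 13:52:21",
    ]),
    "value": [1, 2, 4, 7, 2, 8, 3, 5, 9],
})


def iter_day_windows(frame, n_days=2, column="datetime"):
    """Yield sub-frames spanning n_days consecutive observed dates,
    advancing by one date each step (hypothetical helper, not a pandas API)."""
    # Truncate each timestamp to midnight so rows can be compared by date.
    dates = frame[column].dt.normalize()
    # Distinct dates present in the data, in ascending order.
    unique_dates = pd.DatetimeIndex(dates.unique()).sort_values()
    for start in range(len(unique_dates) - n_days + 1):
        wanted = unique_dates[start:start + n_days]
        yield frame[dates.isin(wanted)]


windows = list(iter_day_windows(df, n_days=2))
for w in windows:
    print(w, end="\n\n")
```

With the sample data this yields three frames, the last one covering 2015-12-04 and 2015-12-07. If a window should instead span calendar days regardless of gaps in the data, the slicing over `unique_dates` would have to be replaced by a date range, which this sketch does not attempt.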