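Python: split a list of datetimes on missing days

Time: 2014-12-05 00:07:14

Tags: python python-datetime

How can I split a list of datetimes that has missing days into sublists, breaking wherever a day is missing?

Take the following example: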

import datetime

date_list = [
        datetime.datetime(2012,1,1,0,0,0),
        datetime.datetime(2012,1,2,0,0,0),
        datetime.datetime(2012,1,4,0,0,0),
        datetime.datetime(2012,1,7,0,0,0),
        datetime.datetime(2012,1,8,0,0,0),
        ]

The result I am looking for is

[[datetime.datetime(2012,1,1,0,0,0), datetime.datetime(2012,1,2,0,0,0)],
[datetime.datetime(2012,1,4,0,0,0)],
[datetime.datetime(2012,1,7,0,0,0), datetime.datetime(2012,1,8,0,0,0)]]

I tried using groupby, but I can't figure out what to use as the key.

[list(g) for k, g in itertools.groupby(date_list, key=lambda d: d.day)]

3 answers:

Answer 0 (score: 2)

This works for the given example...

>>> import datetime
>>> date_list = [
...         datetime.datetime(2012,1,1,0,0,0),
...         datetime.datetime(2012,1,2,0,0,0),
...         datetime.datetime(2012,1,4,0,0,0),
...         datetime.datetime(2012,1,7,0,0,0),
...         datetime.datetime(2012,1,8,0,0,0),
...         ]
>>> import itertools
>>> [list(g) for k, g in itertools.groupby(enumerate(date_list), key=lambda p: p[0] - p[1].day)]
[[(0, datetime.datetime(2012, 1, 1, 0, 0)), (1, datetime.datetime(2012, 1, 2, 0, 0))], [(2, datetime.datetime(2012, 1, 4, 0, 0))], [(3, datetime.datetime(2012, 1, 7, 0, 0)), (4, datetime.datetime(2012, 1, 8, 0, 0))]]

If you don't want the indices, this may be better...

>>> [[v for i, v in g] for k, g in itertools.groupby(enumerate(date_list), key=lambda p: p[0] - p[1].day)]
[[datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 2, 0, 0)], [datetime.datetime(2012, 1, 4, 0, 0)], [datetime.datetime(2012, 1, 7, 0, 0), datetime.datetime(2012, 1, 8, 0, 0)]]
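Note that the `i - x.day` key only holds within a single month, because `.day` resets to 1 at a month boundary. Below is a minimal sketch of a variant that swaps `.day` for `toordinal()` so that runs crossing into a new month or year stay together. The helper name `split_on_gaps` is hypothetical (it is not from the answer), and the sketch assumes the input is sorted and has at most one entry per calendar day.

```python
import datetime
import itertools


def split_on_gaps(dates):
    """Group sorted datetimes into runs of consecutive calendar days.

    Hypothetical helper: assumes `dates` is sorted ascending and contains
    at most one entry per calendar day.
    """
    # For consecutive days, (ordinal - index) stays constant; any gap changes it.
    grouped = itertools.groupby(
        enumerate(dates),
        key=lambda pair: pair[1].toordinal() - pair[0],
    )
    # Drop the enumerate index and keep only the datetimes.
    return [[d for _, d in group] for _, group in grouped]


dates = [
    datetime.datetime(2012, 1, 31),
    datetime.datetime(2012, 2, 1),
    datetime.datetime(2012, 2, 3),
]
runs = split_on_gaps(dates)
print(runs)
```

With the original `.day` key, 31 January and 1 February would land in different groups even though they are consecutive; the ordinal-based key keeps them together.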

Answer 1 (score: 2)

Here is a boring for-loop helper function.

def date_segments(dates):
    if not dates:                        # nothing to split
        return []
    output = []
    cur_list = [dates[0]]
    # Pair each date with its predecessor.
    for cur, prev in zip(dates[1:], dates):
        if (cur - prev).days > 1:        # gap: close the current run
            output.append(cur_list)
            cur_list = [cur]
        else:
            cur_list.append(cur)
    output.append(cur_list)
    return output

This gives:

In [28]: date_segments(date_list)
Out[28]: 
[[datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 2, 0, 0)],
 [datetime.datetime(2012, 1, 4, 0, 0)],
 [datetime.datetime(2012, 1, 7, 0, 0), datetime.datetime(2012, 1, 8, 0, 0)]]

If I define the itertools.groupby approach as a helper function named other_way, like this:

from itertools import groupby
def other_way(date_list):
    return [[v for i, v in g] for k, g in groupby(enumerate(date_list),
                                                  key=lambda p: p[0] - p[1].day)]

Then, for this admittedly small example, timeit shows that the for-loop approach is slightly faster:

In [31]: %timeit date_segments(date_list) 
100000 loops, best of 3: 3.2 µs per loop

In [32]: %timeit other_way(date_list)
100000 loops, best of 3: 3.72 µs per loop

Personally, I also find the for-loop approach more Pythonic and readable.
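To repeat the comparison outside IPython, something like the following sketch with the standard `timeit` module should work. The two functions are restated in Python 3 syntax so the block runs on its own, and the iteration count of 10000 is an arbitrary choice. Absolute timings will differ by machine and Python version, so treat the numbers as indicative only.

```python
import datetime
import timeit
from itertools import groupby


def date_segments(dates):
    """For-loop version: start a new sublist whenever the gap exceeds one day."""
    if not dates:
        return []
    output = []
    cur_list = [dates[0]]
    for cur, prev in zip(dates[1:], dates):
        if (cur - prev).days > 1:
            output.append(cur_list)
            cur_list = [cur]
        else:
            cur_list.append(cur)
    output.append(cur_list)
    return output


def other_way(dates):
    """groupby version using the index-minus-day key (single-month data only)."""
    return [[v for _, v in g]
            for _, g in groupby(enumerate(dates), key=lambda p: p[0] - p[1].day)]


date_list = [
    datetime.datetime(2012, 1, 1),
    datetime.datetime(2012, 1, 2),
    datetime.datetime(2012, 1, 4),
    datetime.datetime(2012, 1, 7),
    datetime.datetime(2012, 1, 8),
]

# Sanity check: both approaches must agree before timing them.
assert date_segments(date_list) == other_way(date_list)

# Total seconds for 10000 calls of each function.
t_loop = timeit.timeit(lambda: date_segments(date_list), number=10000)
t_groupby = timeit.timeit(lambda: other_way(date_list), number=10000)
print("for loop: %.4f s, groupby: %.4f s" % (t_loop, t_groupby))
```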

Answer 2 (score: 1)

You can build a key that "switches" whenever the dates are not consecutive:

class Switcher():
    def __call__(self, d):
        if not hasattr(self, 'prev'):    # first element: init switch
            self.switch = 1
        elif (d - self.prev).days > 1:   # not consecutive: invert switch
            self.switch *= -1
        self.prev = d                    # save current value
        return self.switch

Then you can use it like this:

>>> [list(g) for k, g in groupby(date_list, key = Switcher())]
[[datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 2, 0, 0)],
 [datetime.datetime(2012, 1, 4, 0, 0)],
 [datetime.datetime(2012, 1, 7, 0, 0), datetime.datetime(2012, 1, 8, 0, 0)]]
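Flipping between two values is enough here because `groupby` only compares the keys of adjacent elements. As a sketch of the same idea in a different shape, the key can also be a closure that bumps a run counter at each gap, so every group gets a distinct key. The name `run_counter` is hypothetical and not part of the answer.

```python
import datetime
from itertools import groupby


def run_counter():
    """Return a stateful key function that increments a counter at each gap.

    Hypothetical helper: the returned function remembers the previous date,
    so create a fresh one for every groupby() call.
    """
    state = {"prev": None, "run": 0}

    def key(d):
        # A gap of more than one day starts a new run.
        if state["prev"] is not None and (d - state["prev"]).days > 1:
            state["run"] += 1
        state["prev"] = d
        return state["run"]

    return key


date_list = [
    datetime.datetime(2012, 1, 1),
    datetime.datetime(2012, 1, 2),
    datetime.datetime(2012, 1, 4),
    datetime.datetime(2012, 1, 7),
    datetime.datetime(2012, 1, 8),
]

groups = [list(g) for _, g in groupby(date_list, key=run_counter())]
print(groups)
```

As with `Switcher`, the key carries state between calls, so reusing one instance across several `groupby` calls would let the last date of one list influence the first date of the next.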