How do I split a list of datetimes into sublists wherever a date is missing?
Using the following example:
import datetime

date_list = [
    datetime.datetime(2012,1,1,0,0,0),
    datetime.datetime(2012,1,2,0,0,0),
    datetime.datetime(2012,1,4,0,0,0),
    datetime.datetime(2012,1,7,0,0,0),
    datetime.datetime(2012,1,8,0,0,0),
]
The result I am looking for is
[[datetime.datetime(2012,1,1,0,0,0), datetime.datetime(2012,1,2,0,0,0)],
 [datetime.datetime(2012,1,4,0,0,0)],
 [datetime.datetime(2012,1,7,0,0,0), datetime.datetime(2012,1,8,0,0,0)]]
I tried using groupby, but I can't figure out what to use as the key.
[list(g) for k, g in itertools.groupby(date_list, key=lambda d: d.day)]
Answer 0 (score: 2)
This works for the given example...
>>> import datetime
>>> date_list = [
... datetime.datetime(2012,1,1,0,0,0),
... datetime.datetime(2012,1,2,0,0,0),
... datetime.datetime(2012,1,4,0,0,0),
... datetime.datetime(2012,1,7,0,0,0),
... datetime.datetime(2012,1,8,0,0,0),
... ]
>>> import itertools
>>> [list(g) for k, g in itertools.groupby(enumerate(date_list), key=lambda p: p[0] - p[1].day)]
[[(0, datetime.datetime(2012, 1, 1, 0, 0)), (1, datetime.datetime(2012, 1, 2, 0, 0))], [(2, datetime.datetime(2012, 1, 4, 0, 0))], [(3, datetime.datetime(2012, 1, 7, 0, 0)), (4, datetime.datetime(2012, 1, 8, 0, 0))]]
If you don't want the indices, this may be better...
>>> [[v for i, v in g] for k, g in itertools.groupby(enumerate(date_list), key=lambda p: p[0] - p[1].day)]
[[datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 2, 0, 0)], [datetime.datetime(2012, 1, 4, 0, 0)], [datetime.datetime(2012, 1, 7, 0, 0), datetime.datetime(2012, 1, 8, 0, 0)]]
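Note that the key `i - x.day` stays constant within a run only while all the dates fall in the same month. Here is a minimal sketch of the same idea, assuming Python 3, a sorted list, and at most one entry per day. It swaps `.day` for `toordinal()` so that a run crossing a month boundary stays together. The name `split_consecutive` is made up for this illustration.

```python
import datetime
from itertools import groupby


def split_consecutive(dates):
    """Group a sorted list of datetimes into runs of consecutive days."""
    # (index - ordinal day number) is constant inside a run of consecutive
    # days and changes whenever a day is skipped.
    keyed = groupby(enumerate(dates),
                    key=lambda pair: pair[0] - pair[1].toordinal())
    return [[d for _, d in group] for _, group in keyed]


# A run that crosses from January into February.
month_edge = [
    datetime.datetime(2012, 1, 31),
    datetime.datetime(2012, 2, 1),
    datetime.datetime(2012, 2, 3),
]
result = split_consecutive(month_edge)
```

With this input, `result` should contain two sublists: Jan 31 together with Feb 1, then Feb 3 on its own.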
Answer 1 (score: 2)
Here is a boring for-loop helper function.
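def date_segments(dates):
    output = []
    cur_list = [dates[0]]  # assumes dates is sorted and non-empty
    # Each dt_pair is (current, previous).
    for dt_pair in zip(dates[1:], dates):
        if (dt_pair[0] - dt_pair[1]).days > 1:
            # Gap found: close the current segment and start a new one.
            output.append(cur_list)
            cur_list = [dt_pair[0]]
        else:
            cur_list.append(dt_pair[0])
    output.append(cur_list)
    return output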
This gives:
In [28]: date_segments(date_list)
Out[28]:
[[datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 2, 0, 0)],
[datetime.datetime(2012, 1, 4, 0, 0)],
[datetime.datetime(2012, 1, 7, 0, 0), datetime.datetime(2012, 1, 8, 0, 0)]]
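One caveat: `date_segments` indexes `dates[0]`, so it raises `IndexError` on an empty list. Below is a sketch of a generator variant that accepts any sorted iterable and simply yields nothing for empty input. The name `iter_date_segments` is hypothetical, and returning nothing for empty input is an assumption about what you would want.

```python
import datetime


def iter_date_segments(dates):
    """Yield lists of consecutive-day datetimes from a sorted iterable."""
    segment = []
    for current in dates:
        # Start a new segment when the gap to the previous item
        # is two days or more (timedelta.days > 1).
        if segment and (current - segment[-1]).days > 1:
            yield segment
            segment = []
        segment.append(current)
    # Emit the final segment, if there was any input at all.
    if segment:
        yield segment


date_list = [
    datetime.datetime(2012, 1, 1),
    datetime.datetime(2012, 1, 2),
    datetime.datetime(2012, 1, 4),
    datetime.datetime(2012, 1, 7),
    datetime.datetime(2012, 1, 8),
]
segments = list(iter_date_segments(date_list))
```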
If I define the itertools.groupby approach as a helper function named other_way, like this:
from itertools import groupby

def other_way(date_list):
    return [[v for i, v in g]
            for k, g in groupby(enumerate(date_list),
                                key=lambda p: p[0] - p[1].day)]
then, for this admittedly small example, timeit shows that the for-loop approach is slightly faster:
In [31]: %timeit date_segments(date_list)
100000 loops, best of 3: 3.2 µs per loop
In [32]: %timeit other_way(date_list)
100000 loops, best of 3: 3.72 µs per loop
Personally, I also find the for-loop approach more Pythonic and more readable.
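Outside IPython, a comparison like this can be run with the standard-library `timeit` module. The sketch below assumes Python 3 and re-declares both helpers so that it runs on its own. The repeat count of 10000 is an arbitrary choice, and no timings are shown because they depend on the machine and Python version.

```python
import datetime
import timeit
from itertools import groupby


def date_segments(dates):
    # For-loop approach: start a new list at each gap of two days or more.
    output = []
    cur_list = [dates[0]]
    for cur, prev in zip(dates[1:], dates):
        if (cur - prev).days > 1:
            output.append(cur_list)
            cur_list = [cur]
        else:
            cur_list.append(cur)
    output.append(cur_list)
    return output


def other_way(date_list):
    # groupby approach: index minus day-of-month is constant within a run.
    return [[v for i, v in g]
            for k, g in groupby(enumerate(date_list),
                                key=lambda p: p[0] - p[1].day)]


date_list = [datetime.datetime(2012, 1, d) for d in (1, 2, 4, 7, 8)]

# Total seconds for 10000 calls of each function.
loop_time = timeit.timeit(lambda: date_segments(date_list), number=10000)
groupby_time = timeit.timeit(lambda: other_way(date_list), number=10000)
print("for loop: %.4f s, groupby: %.4f s" % (loop_time, groupby_time))
```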
Answer 2 (score: 1)
You can build a key that "switches" whenever two dates are not consecutive:
class Switcher():
    def __call__(self, d):
        if not hasattr(self, 'prev'):  # first element: init switch
            self.switch = 1
        elif (d - self.prev).days > 1:  # not consecutive: invert switch
            self.switch *= -1
        self.prev = d  # save current value
        return self.switch
Then you can use it like this:
>>> [list(g) for k, g in groupby(date_list, key = Switcher())]
[[datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 2, 0, 0)],
[datetime.datetime(2012, 1, 4, 0, 0)],
[datetime.datetime(2012, 1, 7, 0, 0), datetime.datetime(2012, 1, 8, 0, 0)]]
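The same stateful-key idea can also be written as a closure that returns an increasing group number instead of flipping between 1 and -1. This is only a sketch, assuming Python 3 and a sorted input. `make_run_key` is a name invented here, not part of itertools.

```python
import datetime
from itertools import groupby


def make_run_key():
    """Return a key function whose value changes at each gap of 2+ days."""
    state = {"prev": None, "group": 0}

    def key(d):
        # Bump the group number when the gap to the previous date
        # is two days or more.
        if state["prev"] is not None and (d - state["prev"]).days > 1:
            state["group"] += 1
        state["prev"] = d  # remember the current value for the next call
        return state["group"]

    return key


date_list = [
    datetime.datetime(2012, 1, 1),
    datetime.datetime(2012, 1, 2),
    datetime.datetime(2012, 1, 4),
    datetime.datetime(2012, 1, 7),
    datetime.datetime(2012, 1, 8),
]
runs = [list(g) for _, g in groupby(date_list, key=make_run_key())]
```

As with `Switcher()`, the key carries state, so a fresh one should be created for every `groupby` call rather than reused.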