Looping over 24-hour periods using a list of dates in Python

Time: 2018-06-15 08:23:09

Tags: python python-3.x datetime python-datetime datetime64

I have a list of np.datetime64 dates in Python:

['2016-12-01T02:00:00.000000000', '2016-12-01T04:00:00.000000000',
 '2016-12-01T06:00:00.000000000', '2016-12-01T08:00:00.000000000',
 '2016-12-01T10:00:00.000000000', '2016-12-01T12:00:00.000000000', 
 '2016-12-01T14:00:00.000000000', '2016-12-01T16:00:00.000000000', 
 '2016-12-01T18:00:00.000000000', '2016-12-01T20:00:00.000000000', 
 '2016-12-01T22:00:00.000000000', '2016-12-02T00:00:00.000000000', 
 '2016-12-02T02:00:00.000000000', '2016-12-02T04:00:00.000000000', 
 '2016-12-02T06:00:00.000000000', '2016-12-02T08:00:00.000000000', 
 '2016-12-02T10:00:00.000000000', '2016-12-02T12:00:00.000000000', 
 '2016-12-02T14:00:00.000000000', '2016-12-02T16:00:00.000000000', 
 '2016-12-02T18:00:00.000000000', '2016-12-02T20:00:00.000000000', 
 '2016-12-02T22:00:00.000000000', '2016-12-03T00:00:00.000000000', 
 '2016-12-03T02:00:00.000000000', '2016-12-03T04:00:00.000000000',
 '2016-12-03T06:00:00.000000000', '2016-12-03T08:00:00.000000000', 
 '2016-12-03T10:00:00.000000000', '2016-12-03T12:00:00.000000000', 
 '2016-12-03T14:00:00.000000000', '2016-12-03T16:00:00.000000000', 
 '2016-12-03T18:00:00.000000000', '2016-12-03T20:00:00.000000000', 
 '2016-12-03T22:00:00.000000000']

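I want to loop over each calendar day in the list. I tried extracting each unique date from the list (i.e. finding the minimum and maximum dates and creating a list of the dates between them), but this is not ideal for what I want to do.

The result I want is code that lets me loop over each date/calendar day in the list and get the datetimes corresponding to that date: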

for each_date in date_list:
    ***get all datetimes corresponding to each_date***

(loop would occur 3 times in this example)

Note:

1) Iterating over each [n:n + 24] slice, or any similar solution, will not work, because not every day has the same number of time steps.

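2 answers:

Answer 0: (score: 3)

If the timestamps are ordered, we can use the itertools.groupby function to group the array elements by their corresponding day.

The day can be obtained with np.datetime64.astype(..., dtype='datetime64[D]'), so we can write this as: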

from numpy import datetime64
from functools import partial
from itertools import groupby

for day, timestamps in groupby(data_array,
                               partial(datetime64.astype, dtype='datetime64[D]')):
    # process day and timestamps
    pass

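Here day is a datetime64[D] NumPy object (containing only the day), and timestamps is an iterable of the corresponding timestamps (not a list, but we can convert it to one). data_array is the array containing the initial data.

For example: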

>>> for day, timestamps in groupby(data_array,
...                                partial(datetime64.astype, dtype='datetime64[D]')):
...     print((day, list(timestamps)))
... 
(numpy.datetime64('2016-12-01'), [numpy.datetime64('2016-12-01T02:00:00.000000000'), numpy.datetime64('2016-12-01T04:00:00.000000000'), numpy.datetime64('2016-12-01T06:00:00.000000000'), numpy.datetime64('2016-12-01T08:00:00.000000000'), numpy.datetime64('2016-12-01T10:00:00.000000000'), numpy.datetime64('2016-12-01T12:00:00.000000000'), numpy.datetime64('2016-12-01T14:00:00.000000000'), numpy.datetime64('2016-12-01T16:00:00.000000000'), numpy.datetime64('2016-12-01T18:00:00.000000000'), numpy.datetime64('2016-12-01T20:00:00.000000000'), numpy.datetime64('2016-12-01T22:00:00.000000000')])
(numpy.datetime64('2016-12-02'), [numpy.datetime64('2016-12-02T00:00:00.000000000'), numpy.datetime64('2016-12-02T02:00:00.000000000'), numpy.datetime64('2016-12-02T04:00:00.000000000'), numpy.datetime64('2016-12-02T06:00:00.000000000'), numpy.datetime64('2016-12-02T08:00:00.000000000'), numpy.datetime64('2016-12-02T10:00:00.000000000'), numpy.datetime64('2016-12-02T12:00:00.000000000'), numpy.datetime64('2016-12-02T14:00:00.000000000'), numpy.datetime64('2016-12-02T16:00:00.000000000'), numpy.datetime64('2016-12-02T18:00:00.000000000'), numpy.datetime64('2016-12-02T20:00:00.000000000'), numpy.datetime64('2016-12-02T22:00:00.000000000')])
(numpy.datetime64('2016-12-03'), [numpy.datetime64('2016-12-03T00:00:00.000000000'), numpy.datetime64('2016-12-03T02:00:00.000000000'), numpy.datetime64('2016-12-03T04:00:00.000000000'), numpy.datetime64('2016-12-03T06:00:00.000000000'), numpy.datetime64('2016-12-03T08:00:00.000000000'), numpy.datetime64('2016-12-03T10:00:00.000000000'), numpy.datetime64('2016-12-03T12:00:00.000000000'), numpy.datetime64('2016-12-03T14:00:00.000000000'), numpy.datetime64('2016-12-03T16:00:00.000000000'), numpy.datetime64('2016-12-03T18:00:00.000000000'), numpy.datetime64('2016-12-03T20:00:00.000000000'), numpy.datetime64('2016-12-03T22:00:00.000000000')])

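So here we chose to print, for each day, the list of corresponding timestamps, but that is of course only one option. As the output shows, not all groups have the same length (the last two groups have one more element than the first).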

Note that timestamps is an iterator, so if you do not convert it to a list, it is exhausted after a single pass over it.
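As a minimal sketch of that point, the snippet below consumes each group twice. The four sample timestamps and the helper name to_day are assumptions made up for this illustration; they are not part of the original data.

```python
from itertools import groupby

import numpy as np

# Hypothetical sample data: two timestamps on each of two calendar days.
data_array = np.array(
    ['2016-12-01T20:00', '2016-12-01T22:00',
     '2016-12-02T00:00', '2016-12-02T02:00'],
    dtype='datetime64[ns]')


def to_day(ts):
    # Truncate a single datetime64 value to day precision.
    return ts.astype('datetime64[D]')


passes = []
for day, timestamps in groupby(data_array, key=to_day):
    first_pass = list(timestamps)   # consumes the group iterator
    second_pass = list(timestamps)  # nothing is left the second time
    passes.append((day, len(first_pass), len(second_pass)))
```

If the timestamps of a day are needed more than once, storing list(timestamps) in a variable first avoids the empty second pass.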

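groupby works in linear time, since for each element it only checks whether the "group key" is the same as that of the previous element, but as said before, the data must be sorted.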
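If the input is not guaranteed to be ordered, one option is to sort it before grouping, at the cost of O(n log n) instead of linear time. The following is only a sketch under that assumption; the sample array unsorted and the helper names to_day and group_by_day are hypothetical.

```python
from itertools import groupby

import numpy as np

# Hypothetical, deliberately unordered sample data.
unsorted = np.array(
    ['2016-12-02T02:00', '2016-12-01T22:00',
     '2016-12-02T00:00', '2016-12-01T20:00'],
    dtype='datetime64[ns]')


def to_day(ts):
    # Truncate a single datetime64 value to day precision.
    return ts.astype('datetime64[D]')


def group_by_day(timestamps):
    # Sort first so that all timestamps of one day are adjacent,
    # which is what groupby needs to emit one group per day.
    ordered = np.sort(timestamps)
    return {day: list(group) for day, group in groupby(ordered, key=to_day)}


# Without sorting, the same day shows up as several separate runs.
runs_without_sort = [day for day, _ in groupby(unsorted, key=to_day)]

by_day = group_by_day(unsorted)
```

Here runs_without_sort contains each day more than once, while by_day has exactly one entry per calendar day.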

Answer 1: (score: 1)

You can use collections.defaultdict for an O(n) solution. You can use Pandas to normalize the datetime objects, although this is also possible with NumPy.

import pandas as pd
from collections import defaultdict

d = defaultdict(list)

for item in L:
    day = pd.to_datetime(item).normalize().to_datetime64()
    d[day].append(item)

print(d)

defaultdict(list,
            {numpy.datetime64('2016-12-01T00:00:00.000000000'):
                 [numpy.datetime64('2016-12-01T02:00:00.000000000'),
                  ...
                  numpy.datetime64('2016-12-01T22:00:00.000000000')],
             numpy.datetime64('2016-12-02T00:00:00.000000000'):
                 [numpy.datetime64('2016-12-02T00:00:00.000000000'),
                  ...
                  numpy.datetime64('2016-12-02T22:00:00.000000000')],
             numpy.datetime64('2016-12-03T00:00:00.000000000'):
                 [numpy.datetime64('2016-12-03T00:00:00.000000000'),
                  ...
                  numpy.datetime64('2016-12-03T22:00:00.000000000')]})
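The NumPy-only variant mentioned above could look roughly like this. It is a sketch, not the answerer's code: the three sample values in L are made up, and the keys end up as datetime64[D] values rather than midnight datetime64[ns] values as in the Pandas version.

```python
from collections import defaultdict

import numpy as np

# Hypothetical sample data; the order does not matter for this approach.
L = np.array(
    ['2016-12-01T22:00', '2016-12-02T00:00', '2016-12-01T20:00'],
    dtype='datetime64[ns]')

d = defaultdict(list)

for item in L:
    # Truncate to day precision with NumPy instead of pandas' normalize().
    day = item.astype('datetime64[D]')
    d[day].append(item)
```

Because each element is placed into its bucket directly, this stays O(n) and, unlike groupby, does not need the input to be sorted.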