Efficient way to divide a list of datetimes into "sequential" datetimes?

Time: 2016-02-24 21:48:03

Tags: python datetime

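Let's say that "sequential" datetimes are datetimes within a specific time interval (e.g. 30 minutes) of each other; non-sequential datetimes are those separated by a longer period.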
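To make the definition concrete, here is a minimal sketch of the pairwise check it implies. The helper name `is_sequential` and the constants are illustrative assumptions, not part of the original post, and the sketch reads "within 30 minutes" as "at most 30 minutes later".

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"
INCREMENT = timedelta(minutes=30)  # assumed interval, taken from the 30-minute example


def is_sequential(earlier, later, increment=INCREMENT):
    """Return True if `later` is at most `increment` after `earlier`."""
    return timedelta(0) <= later - earlier <= increment


a = datetime.strptime("2016-02-25 11:30:00", FMT)
b = datetime.strptime("2016-02-25 12:00:00", FMT)
c = datetime.strptime("2016-02-26 10:30:00", FMT)

print(is_sequential(a, b))  # True: exactly 30 minutes apart
print(is_sequential(b, c))  # False: 22.5 hours apart
```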

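Given an input consisting of a list of datetimes (as strings), I would like to get a list of lists of sequential datetimes. My solution is below, but I wonder whether there is a better way: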

from datetime import datetime

list_of_datetime_strings = ['2016-02-26 10:30:00', '2016-02-26 11:00:00',
'2016-02-25 11:30:00', '2016-02-25 12:00:00', '2016-02-25 12:30:00',
'2016-02-26 12:30:00']

def find_datetime_sequences(list_of_datetime_strings, increment_in_minutes=30):
    if not list_of_datetime_strings:
        return []

    str_to_datetime = lambda cur_datetime: datetime.strptime(cur_datetime, "%Y-%m-%d %H:%M:%S")
    list__datetimes_sorted = sorted(str_to_datetime(cur_datetime) for cur_datetime in list_of_datetime_strings)

    # Start the first group with the earliest datetime
    list_of_datetime_lists = [[list__datetimes_sorted[0]]]

    for cur_datetime in list__datetimes_sorted[1:]:
        # total_seconds() counts whole days too; .seconds alone would ignore them
        time_difference = (cur_datetime - list_of_datetime_lists[-1][-1]).total_seconds() / 60

        if time_difference == increment_in_minutes:
            list_of_datetime_lists[-1].append(cur_datetime)
        else:
            list_of_datetime_lists.append([cur_datetime])

    return list_of_datetime_lists

find_datetime_sequences(list_of_datetime_strings)

Output:

list_of_datetime_lists: [[datetime.datetime(2016, 2, 25, 11, 30), 
     datetime.datetime(2016, 2, 25, 12, 0), datetime.datetime(2016, 2, 25, 12, 30)], 
    [datetime.datetime(2016, 2, 26, 10, 30), datetime.datetime(2016, 2, 26, 11, 0)], 
    [datetime.datetime(2016, 2, 26, 12, 30)]]

Is there a better way to achieve the above?
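A general caveat when measuring gaps between datetimes: `timedelta.seconds` holds only the sub-day part of a duration, while `timedelta.total_seconds()` covers the whole duration. The short sketch below, with values chosen purely for illustration, shows how a gap of one day and 30 minutes can look like a 30-minute gap if only `.seconds` is inspected.

```python
from datetime import datetime

a = datetime(2016, 2, 25, 12, 0)
b = datetime(2016, 2, 26, 12, 30)  # one day and 30 minutes after `a`

gap = b - a
minutes_from_seconds = gap.seconds / 60        # sub-day remainder only
minutes_from_total = gap.total_seconds() / 60  # full duration

print(minutes_from_seconds)  # 30.0
print(minutes_from_total)    # 1470.0
```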

2 Answers:

Answer 0: (score: 2)

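I don't have a better way of making datetime objects from the strings or of sorting them. But I think the rest can be improved (in readability, if nothing else) by using a generator instead of a regular function.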

import datetime

def sequencify(sorted_datetimes, increment_in_minutes=30):
    """Take a sorted list of datetime objects. Yield sequences as lists."""
    if not sorted_datetimes:
        return

    first, *rest = sorted_datetimes
    # python 2: first, rest = sorted_datetimes[0], sorted_datetimes[1:]
    sequence = [first]
    delta = datetime.timedelta(minutes=increment_in_minutes)
    while rest:
        first, *rest = rest
        if first - sequence[-1] > delta:
            yield sequence
            sequence = [first]
        else:
            sequence.append(first)
    yield sequence

An alternative version using an index-based approach, similar to what @SimeonVisser did:

def sequencify(sorted_datetimes, increment_in_minutes=30):
    """Take a sorted list of datetime objects. Yield sequences as lists."""
    delta = datetime.timedelta(minutes=increment_in_minutes)
    start = 0
    for i in range(start, len(sorted_datetimes) - 1):
        if sorted_datetimes[i+1] - sorted_datetimes[i] > delta:
            yield sorted_datetimes[start:i+1]
            start = i + 1
    if sorted_datetimes:
        yield sorted_datetimes[start:]

Either way, the caller needs only a minimal change: just add list()

strings = [
    '2016-02-26 10:30:00',
    '2016-02-26 11:00:00',
    '2016-02-25 11:30:00',
    '2016-02-25 12:00:00',
    '2016-02-25 12:30:00',
    '2016-02-26 12:30:00'
]
sorted_datetimes = sorted(datetime.datetime.strptime(s, '%Y-%m-%d %H:%M:%S')
                          for s in strings)
print(list(sequencify(sorted_datetimes)))  # explicit conversion to list

Output:

[[datetime.datetime(2016, 2, 25, 11, 30),
  datetime.datetime(2016, 2, 25, 12, 0),
  datetime.datetime(2016, 2, 25, 12, 30)],
 [datetime.datetime(2016, 2, 26, 10, 30),
  datetime.datetime(2016, 2, 26, 11, 0)],
 [datetime.datetime(2016, 2, 26, 12, 30)]]
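Since `sequencify` is a generator, the caller is not obliged to build the full list; the groups can also be consumed one at a time in a loop. The sketch below repeats the index-based version so that it runs on its own; the `summary` variable is only an illustrative name.

```python
import datetime


def sequencify(sorted_datetimes, increment_in_minutes=30):
    """Take a sorted list of datetime objects. Yield sequences as lists."""
    delta = datetime.timedelta(minutes=increment_in_minutes)
    start = 0
    for i in range(len(sorted_datetimes) - 1):
        # A gap longer than delta closes the current run
        if sorted_datetimes[i + 1] - sorted_datetimes[i] > delta:
            yield sorted_datetimes[start:i + 1]
            start = i + 1
    if sorted_datetimes:
        yield sorted_datetimes[start:]


strings = [
    '2016-02-26 10:30:00',
    '2016-02-26 11:00:00',
    '2016-02-25 11:30:00',
    '2016-02-25 12:00:00',
    '2016-02-25 12:30:00',
    '2016-02-26 12:30:00',
]
sorted_datetimes = sorted(datetime.datetime.strptime(s, '%Y-%m-%d %H:%M:%S')
                          for s in strings)

# Consume the runs lazily, keeping only (first, last, length) for each
summary = []
for run in sequencify(sorted_datetimes):
    summary.append((str(run[0]), str(run[-1]), len(run)))

for entry in summary:
    print(entry)
```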

Answer 1: (score: 1)

The following is basically the same approach, but it may be easier to maintain:

import datetime

strings = [
    '2016-02-26 10:30:00',
    '2016-02-26 11:00:00',
    '2016-02-25 11:30:00',
    '2016-02-25 12:00:00',
    '2016-02-25 12:30:00',
    '2016-02-26 12:30:00',
]

def find_datetime_sequences(strings, increment_in_minutes=30):
    if not strings:
        return []

    dates = sorted([
        datetime.datetime.strptime(s, "%Y-%m-%d %H:%M:%S")
        for s in strings
    ])

    delta = datetime.timedelta(minutes=increment_in_minutes)
    start = 0
    n_items = len(dates)
    cuts = []  # (start, end) index pairs, one per group
    for index in range(n_items):
        next_index = index + 1
        if next_index == n_items and start != next_index:
            # Reached the end: close the final group
            cuts.append((start, next_index))
        elif dates[next_index] - dates[index] != delta:
            # Gap is not exactly delta: close the group and start a new one
            cuts.append((start, next_index))
            start = next_index
    return [dates[i:j] for i, j in cuts]

This part detects when the difference between two consecutive dates is not 30 minutes, so that we need to cut there:

elif dates[next_index] - dates[index] != delta:
    cuts.append((start, next_index))
    start = next_index

And this part makes sure that a datetime at the end that needs to go into a group of its own gets one:

if next_index == n_items and start != next_index:
    cuts.append((start, next_index))
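One difference between the two answers is worth noting: Answer 0 cuts when a gap is longer than the increment (`> delta`), while this answer cuts whenever a gap is not exactly the increment (`!= delta`). On the question's sample data the results agree, but they diverge once a shorter gap appears. The sketch below illustrates this with a hypothetical helper, `split_runs`, and made-up times; neither appears in the original answers.

```python
import datetime


def split_runs(sorted_datetimes, should_cut):
    """Group sorted datetimes, starting a new group whenever should_cut(gap) is true."""
    groups = []
    for current in sorted_datetimes:
        if groups and not should_cut(current - groups[-1][-1]):
            groups[-1].append(current)
        else:
            groups.append([current])
    return groups


delta = datetime.timedelta(minutes=30)
day = datetime.datetime(2016, 2, 25)
# Gaps between neighbours: 15 minutes, 30 minutes, 75 minutes
times = [day.replace(hour=h, minute=m)
         for h, m in [(10, 0), (10, 15), (10, 45), (12, 0)]]

within = split_runs(times, lambda gap: gap > delta)   # cut only on gaps longer than 30 minutes
exact = split_runs(times, lambda gap: gap != delta)   # cut on any gap that is not exactly 30 minutes

print([len(g) for g in within])  # [3, 1]
print([len(g) for g in exact])   # [1, 2, 1]
```

Which rule is appropriate depends on whether "sequential" is meant as "at most 30 minutes apart" or "exactly 30 minutes apart".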