Applying a lambda function to datetime

Time: 2016-10-06 20:53:16

Tags: python lambda

I am using the following code to find clusters in a list where consecutive values differ by <= 1:

from itertools import groupby
from operator import itemgetter
data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
for k, g in groupby(enumerate(data), lambda (i, x): (i-x)):
    print map(itemgetter(1), g)
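
The snippet above is Python 2 (tuple-unpacking lambdas and the `print` statement were removed in Python 3). A minimal Python 3 sketch of the same trick follows; the function name `consecutive_runs` is hypothetical and only used for illustration. It relies on the fact that index minus value stays constant across a run of consecutive integers, so `groupby` puts each run under a single key.

```python
from itertools import groupby

def consecutive_runs(values):
    """Split a sorted list of ints into runs of consecutive values."""
    # Within a run of consecutive integers, index - value is constant,
    # so groupby collects each run under one key.
    return [
        [value for _, value in group]
        for _, group in groupby(enumerate(values), key=lambda pair: pair[0] - pair[1])
    ]

data = [1, 4, 5, 6, 10, 15, 16, 17, 18, 22, 25, 26, 27, 28]
for run in consecutive_runs(data):
    print(run)
```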

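However, if I change data to an array of datetimes, in order to find clusters of datetimes that are only 1 hour apart, it fails.

This is what I am trying: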

>>> data
array([datetime.datetime(2016, 10, 1, 8, 0),
       datetime.datetime(2016, 10, 1, 9, 0),
       datetime.datetime(2016, 10, 1, 10, 0), ...,
       datetime.datetime(2019, 1, 3, 9, 0),
       datetime.datetime(2019, 1, 3, 10, 0),
       datetime.datetime(2019, 1, 3, 11, 0)], dtype=object)

    from itertools import groupby
    from operator import itemgetter
    data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
    for k, g in groupby(enumerate(data), lambda (i, x): (i-x).total_seconds()/3600):
        print map(itemgetter(1), g)

The error is:

    for k, g in groupby(enumerate(data), lambda (i, x): int((i-x).total_seconds()/3600)):
TypeError: unsupported operand type(s) for -: 'int' and 'datetime.datetime'

There are plenty of solutions online, but I would like to use this particular approach for learning purposes.

1 Answer:

Answer 0: (score: 1)

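If you want to get all subsequences of items such that each item is one hour later than the previous one (not clusters of items that all fall within one hour of each other), you need to iterate over pairs (data[i-1], data[i]). Currently, you are iterating over (i, data[i]), which raises TypeError when you try to subtract data[i] from i. A working example might look like this: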

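def find_subsequences(data):
    # Fewer than two items cannot form a subsequence.
    if len(data) <= 1:
        return []

    current_group = [data[0]]
    delta = 3600  # maximum allowed gap between neighbours, in seconds
    results = []

    # zip works on both Python 2 and 3; itertools.izip exists only on Python 2.
    for current, following in zip(data, data[1:]):
        if abs((following - current).total_seconds()) > delta:
            # Here, `current` is the last item of the previous subsequence
            # and `following` is the first item of the next subsequence.
            if len(current_group) >= 2:
                results.append(current_group)

            current_group = [following]
            continue

        current_group.append(following)

    # Flush the final group, which the loop itself never appends.
    if len(current_group) >= 2:
        results.append(current_group)

    return results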
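
Since the question asks to keep the `groupby` approach for learning, here is a hedged Python 3 sketch of how the index trick could be adapted to datetimes. This is not the method used in the answer above: the function name `hourly_runs` and the `step` parameter are assumptions made for illustration. The idea is that `stamp - index * step` stays constant across a run of stamps spaced exactly `step` apart, so both operands of the subtraction are datetime-compatible and the TypeError goes away.

```python
from datetime import datetime, timedelta
from itertools import groupby

def hourly_runs(stamps, step=timedelta(hours=1)):
    """Group datetimes into runs whose neighbours are exactly `step` apart."""
    # datetime - (int * timedelta) is a datetime, and it is identical for
    # every member of an evenly spaced run, so it works as a groupby key.
    return [
        [stamp for _, stamp in group]
        for _, group in groupby(
            enumerate(stamps), key=lambda pair: pair[1] - pair[0] * step
        )
    ]

stamps = [
    datetime(2016, 10, 1, 8, 0),
    datetime(2016, 10, 1, 9, 0),
    datetime(2016, 10, 1, 10, 0),
    datetime(2016, 10, 1, 13, 0),
    datetime(2016, 10, 1, 14, 0),
    datetime(2016, 10, 1, 17, 0),
]
for run in hourly_runs(stamps):
    print(run)
```

Unlike `find_subsequences`, this sketch only groups stamps that are exactly one step apart (a 30-minute gap starts a new group), and it keeps single-item groups instead of dropping them. Filter on `len(run) >= 2` if those should be excluded.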