Group list elements by the difference between them

Time: 2018-07-06 13:30:01

Tags: python list grouping

Given the list

A = [1, 6, 13, 15, 17, 18, 19, 21, 29, 36, 53, 58, 59, 61, 63, 78, 79, 81, 102, 114]

is there a simple way to group all clusters in which the difference between consecutive elements is less than 3?

That is, to obtain something like

[13, 15, 17, 18, 19, 21], [58, 59, 61, 63], [78, 79, 81]

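I was wondering whether there is a built-in function for this, but I could not find anything similar. I tried to solve it with groupby from itertools, but I got stuck. Thanks in advance.

4 answers:

Answer 0 (score: 1)

Here is one approach using plain iteration.

For example: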

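A = [1, 6, 13, 15, 17, 18, 19, 21, 29, 36, 53, 58, 59, 61, 63, 78, 79, 81, 102, 114]
res = []    # finished clusters
temp = []   # cluster currently being built

# Compare each element with its successor; the last element has none.
for i, v in enumerate(A[:-1]):
    if abs(v - A[i + 1]) < 3:
        temp.append(v)
    elif temp:
        # v is close to its predecessor but not to its successor,
        # so it closes the current cluster.
        temp.append(v)
        res.append(temp)
        temp = []

# If the list ends inside a cluster, the loop never closes it.
if temp:
    temp.append(A[-1])
    res.append(temp)
print(res)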

Output:

[[13, 15, 17, 18, 19, 21], [58, 59, 61, 63], [78, 79, 81]]

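Answer 1 (score: 1)

Based on your comment:

"The important thing is to 'store the elements until' the condition is met"

you can use itertools.takewhile for this. From its docstring: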

  

takewhile(predicate, iterable) --> takewhile object

Return successive entries from an iterable as long as the predicate evaluates to true for each entry.
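To make that behaviour concrete, here is a minimal sketch. The sample list and the name is_small are made up for this illustration and are not part of the answer. Note that takewhile stops at the first entry that fails the predicate and never resumes.

```python
import itertools

# Hypothetical sample data and predicate, for illustration only.
values = [1, 2, 3, 10, 2, 1]

def is_small(x):
    return x < 5

# takewhile yields entries until the predicate first fails, then stops for good:
# the trailing 2 and 1 are not returned even though they satisfy the predicate.
prefix = list(itertools.takewhile(is_small, values))
print(prefix)  # [1, 2, 3]
```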

This solution certainly has room for improvement, but the key point is the use of takewhile:

import itertools


class Grouper:
    """simple class to perform comparison when called, storing last element given"""
    def __init__(self, diff):
        self.last = None
        self.diff = diff
    def predicate(self, item):
        if self.last is None:
            return True
        return abs(self.last - item) < self.diff
    def __call__(self, item):
        """called with each item by takewhile"""
        result = self.predicate(item)
        self.last = item
        return result


def group_by_difference(items, diff=3):
    results = []
    start = 0
    remaining_items = items
    while remaining_items:
        g = Grouper(diff)
        group = [*itertools.takewhile(g, remaining_items)]
        results.append(group)
        start += len(group)
        remaining_items = items[start:]
    return results

Calling group_by_difference(A) then gives you the items grouped into clusters, singleton clusters included:

[[1],
 [6],
 [13, 15, 17, 18, 19, 21],
 [29],
 [36],
 [53],
 [58, 59, 61, 63],
 [78, 79, 81],
 [102],
 [114]]
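If only clusters with more than one element are wanted, as in the question, the singletons can be filtered out afterwards. A minimal sketch, assuming the result above is stored in a variable named groups (a hypothetical name, not from the answer):

```python
# `groups` is a hypothetical name holding the result shown above.
groups = [[1], [6], [13, 15, 17, 18, 19, 21], [29], [36], [53],
          [58, 59, 61, 63], [78, 79, 81], [102], [114]]

# Keep only the clusters that contain at least two elements.
clusters = [g for g in groups if len(g) > 1]
print(clusters)  # [[13, 15, 17, 18, 19, 21], [58, 59, 61, 63], [78, 79, 81]]
```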

Answer 2 (score: 0)

Similar to this answer, which asked about runs of equal numbers, you can use numpy.split here:

import numpy as np

def plateaus(A, atol=3):
    # Split A wherever the gap between neighbours is at least atol.
    runs = np.split(A, np.where(np.abs(np.diff(A)) >= atol)[0] + 1)
    # tolist() yields plain Python ints; drop runs with a single element.
    return [x.tolist() for x in runs if len(x) > 1]

A = [1, 6, 13, 15, 17, 18, 19, 21, 29, 36, 53, 58, 59, 61, 63, 78, 79, 81, 102, 114]
print(plateaus(A))

Output:

[[13, 15, 17, 18, 19, 21], [58, 59, 61, 63], [78, 79, 81]]

Without the filter on length, this gives you the singleton clusters as well, just like the itertools.takewhile approach by @sytech.
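The intermediate steps may be easier to follow on a smaller input. The sketch below uses a made-up array B and the hypothetical names gaps and cuts to show what np.diff and np.where contribute before np.split is applied.

```python
import numpy as np

# Small made-up sample, for illustration only.
B = np.array([1, 2, 10, 11, 12, 20])

gaps = np.abs(np.diff(B))            # gaps between neighbours: [1, 8, 1, 1, 8]
cuts = np.where(gaps >= 3)[0] + 1    # indices where a new run starts: [2, 5]
runs = [r.tolist() for r in np.split(B, cuts)]
print(runs)  # [[1, 2], [10, 11, 12], [20]]
```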

Answer 3 (score: 0)

You can use itertools.groupby:

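import itertools
A = [1, 6, 13, 15, 17, 18, 19, 21, 29, 36, 53, 58, 59, 61, 63, 78, 79, 81, 102, 114]
# Pair each element (except the last) with the gap to its successor.
new_a = [(A[i+1]-A[i], A[i]) for i in range(len(A)-1)]
# Runs of pairs sharing the same "gap < 3" flag: [is_close, [elements]].
a = [[k, [c for _, c in g]] for k, g in itertools.groupby(new_a, key=lambda x: x[0] < 3)]
# Pad with the last element so a close run at the very end of A is not lost.
a.append([False, A[-1:]])
final_groups = [a[i][1] + [a[i+1][1][0]] for i in range(len(a)-1) if a[i][0]]
print(final_groups)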

Output:

[[13, 15, 17, 18, 19, 21], [58, 59, 61, 63], [78, 79, 81]]
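For readers unfamiliar with it, itertools.groupby groups consecutive items that share the same key, which is why the code above first pairs each element with the gap to its successor. A minimal sketch on made-up gap values (the names gaps and runs are chosen for this illustration only):

```python
import itertools

# Made-up gaps between neighbouring elements, for illustration only.
gaps = [5, 1, 2, 7, 9, 1]

# groupby starts a new group whenever the key changes, so equal keys that are
# not adjacent end up in separate groups.
runs = [(is_close, list(group))
        for is_close, group in itertools.groupby(gaps, key=lambda g: g < 3)]
print(runs)
# [(False, [5]), (True, [1, 2]), (False, [7, 9]), (True, [1])]
```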