Question

I have a list of arrays that I'd like to put into groups of overlapping values. My instinct is to use itertools.groupby, but I'm not sure how to make it work.

Some example data:

import numpy as np

a = np.array(range(10))
b = np.array(range(90,100))
c = np.array(range(50,60))
d = np.array(range(8,15))
e = np.array(range(55,80))

I'd like to end up with three groups of overlapping (that is, not disjoint) arrays:

groups = [[a,d],[b],[c,e]]

Can I use itertools.groupby to do this?

for k,g in itertools.groupby([a,b,c,d,e], lambda x: SOMETHING?):
    groups.append(list(g))

But I'm not sure what to sort and group by. Any suggestions, using this method or any other? Thanks!

Update:

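Thanks to @abarnert for the solution below. You're right, it's not a huge number of arrays, so iterating by brute force works fine. I also did it with a somewhat clunky nested loop: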

arrays, groups, idx = [a, b, c, d, e], [], []
for N, X in enumerate(arrays):
    if N not in idx:
        group, idx = [X], idx + [N]
        for n, x in enumerate(arrays):
            # x joins X's group if any value of X lies within [x[0], x[-1]]
            # (assumes each array is sorted ascending; bounds are inclusive)
            if n not in idx and np.any((X >= x[0]) & (X <= x[-1])):
                group.append(x)
                idx.append(n)
        groups.append(group)

Answer 1

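If your list of ranges is small enough, you can do this by brute force: check each range for overlap with every other range, "merge" the two whenever you find an overlap, and start the loop over.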

Writing this with numpy arrays is a bit clunky, so let's use (Python 3) range objects to work it out:

def merge(x, y):
    return range(min(x.start, y.start), max(x.stop, y.stop))

def overlap(x, y):
    return x.start in y or y.start in x

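# Sample data from the question, as range objects rather than arrays
a, b, c, d, e = range(10), range(90, 100), range(50, 60), range(8, 15), range(55, 80)

# Map each merged range to the set of original ranges it covers
groups = {a: {a}, b: {b}, c: {c}, d: {d}, e: {e}}

# Restart the scan after every merge; stop once a full pass finds no overlap
while True: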
    for key, value in groups.items():
        for otherkey, othervalue in groups.items():
            if key is otherkey:
                continue
            if overlap(key, otherkey):
                del groups[otherkey]
                del groups[key]
                groups[merge(key, otherkey)] = value | othervalue
                break
        else:
            continue
        break
    else:
        break

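This is obviously a wasteful algorithm, but given that you have few enough objects that you can assign each one to its own variable, who cares? It should be easy to understand, and that's probably more important here.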
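
If the list ever grows to the point where the brute-force restarts become a problem, one common alternative (not what the code above does) is to sort by smallest value and sweep once. Here is a minimal sketch, assuming each array is non-empty and can be treated as the closed interval from its minimum to its maximum; group_overlapping is a hypothetical helper name, not something from numpy or itertools:

```python
import numpy as np

def group_overlapping(arrays):
    """Group arrays whose [min, max] spans overlap (hypothetical helper)."""
    # Sorting by smallest value puts overlapping spans next to each other.
    ordered = sorted(arrays, key=lambda arr: arr.min())
    groups = []
    current_max = None  # largest value seen in the group being built
    for arr in ordered:
        if groups and arr.min() <= current_max:
            # Starts inside the current group's span: same group.
            groups[-1].append(arr)
            current_max = max(current_max, arr.max())
        else:
            # Starts past everything seen so far: open a new group.
            groups.append([arr])
            current_max = arr.max()
    return groups

a = np.arange(10)
b = np.arange(90, 100)
c = np.arange(50, 60)
d = np.arange(8, 15)
e = np.arange(55, 80)

groups = group_overlapping([a, b, c, d, e])
```

With the sample data this should give the groups [a, d], [c, e] and [b], ordered by smallest value rather than in the order written in the question. Note that the interval assumption means two arrays whose spans interleave are grouped even if they share no actual value.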

Grouping overlapping arrays
