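Suppose you have input in the following format:
data = [(1, 5), (7, 2), (3, 4), (4, 8), (6, 3), (5, 2)]
I want to group these numbers into separate buckets (lists or sets).
If a number from one tuple also appears in another tuple, the numbers from both tuples belong in the same bucket; otherwise they go into different buckets.
For the example above, the numbers fall into two buckets: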
bucket_a = {1, 5, 2, 7}
because of these tuples:
(1, 5)
(5, 2)
(7, 2)
and
bucket_b = {3, 4, 6, 8}
because of these tuples:
(3, 4)
(4, 8)
(6, 3)
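Answer 0 (score: 2)
This is essentially a connected components problem, where data defines the edges of a graph. One way to solve it is with a disjoint-set data structure. Here is an example (a Python 2 session):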
>>> from collections import defaultdict
>>> sets = defaultdict(set)
>>> for a in range(1,9): sets[a].add(a)
...
>>> for a,b in data:
...     s = sets[a] | sets[b]
...     for c in s: sets[c] = s
...
>>> sets
defaultdict(<type 'set'>, {1: set([1, 2, 5, 7]), 2: set([1, 2, 5, 7]), 3: set([8, 3, 4, 6]), 4: set([8, 3, 4, 6]), 5: set([1, 2, 5, 7]), 6: set([8, 3, 4, 6]), 7: set([1, 2, 5, 7]), 8: set([8, 3, 4, 6])})
Note that there are much better disjoint-set implementations than the one shown here, but the idea is the same.
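As an illustration of what a more conventional disjoint-set implementation might look like, here is a minimal sketch using path compression and union by size. The names find and connected_buckets are hypothetical, not from the answer, and the code is written for Python 3.

```python
def find(parent, x):
    """Return the representative of x's set, compressing the path on the way."""
    root = x
    while parent[root] != root:
        root = parent[root]
    # Second pass: point every node on the path directly at the root.
    while parent[x] != root:
        nxt = parent[x]
        parent[x] = root
        x = nxt
    return root


def connected_buckets(pairs):
    """Group the numbers in pairs into buckets of connected values."""
    parent = {}
    size = {}
    for a, b in pairs:
        for n in (a, b):
            if n not in parent:
                parent[n] = n
                size[n] = 1
        ra, rb = find(parent, a), find(parent, b)
        if ra == rb:
            continue
        # Union by size: attach the smaller tree under the larger one.
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]
    groups = {}
    for n in parent:
        groups.setdefault(find(parent, n), set()).add(n)
    return list(groups.values())


data = [(1, 5), (7, 2), (3, 4), (4, 8), (6, 3), (5, 2)]
print(sorted(sorted(b) for b in connected_buckets(data)))
# [[1, 2, 5, 7], [3, 4, 6, 8]]
```

Unlike the session above, this sketch does not need to know the range of values in advance, because nodes are registered as they are first seen.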
To get a list of the unique buckets, just take the unique values in sets by id:
>>> seen = set()
>>> for s in sets.values():
...     if id(s) not in seen:
...         print s
...         seen.add(id(s))
...
set([1, 2, 5, 7])
set([8, 3, 4, 6])
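Deduplicating by id works because every member of a bucket ends up pointing at the same set object. A possible alternative, sketched here for Python 3, is to compare buckets by value using frozenset. The variable names merged and unique_buckets are illustrative, and the loop is a slight variation on the session above that adds each pair's own numbers instead of pre-filling the range.

```python
from collections import defaultdict

data = [(1, 5), (7, 2), (3, 4), (4, 8), (6, 3), (5, 2)]

sets = defaultdict(set)
for a, b in data:
    # Merge the buckets of a and b, making sure a and b themselves are included.
    merged = sets[a] | sets[b] | {a, b}
    for c in merged:
        sets[c] = merged

# frozenset is hashable, so equal buckets collapse into a single entry.
unique_buckets = {frozenset(s) for s in sets.values()}
print(sorted(sorted(b) for b in unique_buckets))
# [[1, 2, 5, 7], [3, 4, 6, 8]]
```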
Answer 1 (score: 1)
If you need to keep track of both the buckets and the edges that produced them, this solution uses a dict to track both.
Example:
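def get_buckets(data):
    # Maps each number to the list of edges in its bucket;
    # numbers in the same bucket share the same list object.
    buckets = {}
    for x, y in data:
        if x in buckets and y in buckets:
            if buckets[x] is buckets[y]:
                # Both ends are already in the same bucket.
                buckets[x].append((x, y))
            else:
                # Merge y's bucket into x's and repoint y's members.
                buckets[x].extend(buckets[y])
                buckets[x].append((x, y))
                for a in sum(buckets[y], ()):
                    buckets[a] = buckets[x]
        elif x in buckets:
            buckets[x].append((x, y))
            buckets[y] = buckets[x]
        elif y in buckets:
            buckets[y].append((x, y))
            buckets[x] = buckets[y]
        else:
            buckets[x] = buckets[y] = [(x, y)]
    # Key: frozenset of the numbers in a bucket; value: frozenset of its edges.
    return {frozenset(sum(bucket, ())): bucket
            for bucket in map(frozenset, buckets.values())}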
Answer 2 (score: 0)
This may not be optimal, but here is my solution:
In [101]: data = [(1, 5), (7, 2), (3, 4), (4, 8), (6, 3), (5, 2)]
In [102]: def merge(data):
     ...:     ndata = len(data)
     ...:     sets = [set(data[0])]
     ...:     for d in data[1:]:
     ...:         found = False
     ...:         for s in sets:
     ...:             if s.intersection(d):
     ...:                 s.update(d)
     ...:                 found = True
     ...:                 break
     ...:         if not found:
     ...:             sets.append(set(d))
     ...:     if len(sets) == ndata:
     ...:         return sets
     ...:     else:
     ...:         return merge(sets)
     ...:
In [103]: merge(data)
Out[103]: [{1, 2, 5, 7}, {3, 4, 6, 8}]
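The recursive merge above rescans the list of sets until nothing more can be combined. Since the problem was framed earlier as finding connected components, one alternative is a breadth-first traversal that visits each number and each edge once. This is a sketch for Python 3 rather than part of the answer, and the name components is hypothetical.

```python
from collections import defaultdict, deque


def components(pairs):
    """Return the connected components of the graph whose edges are pairs."""
    adjacency = defaultdict(set)
    for a, b in pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    result = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search from an unvisited number collects one bucket.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        result.append(component)
    return result


data = [(1, 5), (7, 2), (3, 4), (4, 8), (6, 3), (5, 2)]
print(sorted(sorted(c) for c in components(data)))
# [[1, 2, 5, 7], [3, 4, 6, 8]]
```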