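I'm dealing with polygon data in real time here, but the problem is quite simple. I have a huge list containing thousands of sets of polygon indices (integers), and I need to reduce it as "fast" as possible to a list of sets of "connected" indices. That is, any sets containing integers that also appear in another set become one set in the result. I've read several possible solutions involving sets; all I'm after is a final list of sets that had any degree of commonality.
I'm dealing with lots of data here, but for simplicity's sake here is some sample data: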
setA = set([0,1,2])
setB = set([6,7,8,9])
setC = set([4,5,6])
setD = set([3,4,5,0])
setE = set([10,11,12])
setF = set([11,13,14,15])
setG = set([16,17,18,19])
listOfSets = [setA,setB,setC,setD,setE,setF,setG]
In this case I'd be after a list with a result like this, although the ordering is irrelevant:
connectedFacesListOfSets = [set([0,1,2,3,4,5,6,7,8,9]), set([10,11,12,13,14,15]), set([16,17,18,19])]
I've looked for similar solutions, but the one with the highest votes gave incorrect results on my large test data.
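Since the question mentions a solution that gave wrong results on large data, a small checker can help compare candidates. The following is only a sketch: the name `check_merged` is made up for illustration, and it tests necessary conditions only (no overlap left, nothing lost, nothing invented). It does not detect a result that merged sets which should have stayed separate.

```python
def check_merged(original_sets, merged_sets):
    """Return True if merged_sets looks like a complete merge of original_sets."""
    all_original = set().union(*original_sets)
    all_merged = set().union(*merged_sets)

    # The result must hold exactly the same elements as the input.
    if all_original != all_merged:
        return False

    # Result sets must be pairwise disjoint: if they are, their sizes
    # add up to the size of their union.
    if sum(len(m) for m in merged_sets) != len(all_merged):
        return False

    # Every input set must sit entirely inside one result set.
    return all(any(s <= m for m in merged_sets) for s in original_sets)


list_of_sets = [set([0, 1, 2]), set([6, 7, 8, 9]), set([4, 5, 6]),
                set([3, 4, 5, 0]), set([10, 11, 12]),
                set([11, 13, 14, 15]), set([16, 17, 18, 19])]
expected = [set(range(10)), set(range(10, 16)), set(range(16, 20))]
print(check_merged(list_of_sets, expected))
```

Running a candidate solution on the large data set and passing its output through a check like this should at least show whether overlapping sets were left unmerged.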
Answer 0 (score: 4)
It's hard to judge the performance without a sufficiently large data set, but here is some basic code:
while True:
    merged_one = False
    supersets = [listOfSets[0]]
    for s in listOfSets[1:]:
        in_super_set = False
        for ss in supersets:
            if s & ss:
                ss |= s
                merged_one = True
                in_super_set = True
                break
        if not in_super_set:
            supersets.append(s)
    print supersets
    if not merged_one:
        break
    listOfSets = supersets
This takes 3 iterations on the provided data. The output is as follows:
[set([0, 1, 2, 3, 4, 5]), set([4, 5, 6, 7, 8, 9]), set([10, 11, 12, 13, 14, 15]), set([16, 17, 18, 19])]
[set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), set([10, 11, 12, 13, 14, 15]), set([16, 17, 18, 19])]
[set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), set([10, 11, 12, 13, 14, 15]), set([16, 17, 18, 19])]
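To get a feel for the performance on larger inputs, the loop above can be wrapped in a function and timed. This is a rough sketch, not a benchmark: the function name `merge_overlapping`, the random data sizes (1000 sets drawn from 5000 values) and the fixed seed are all assumptions chosen for illustration, and it is written for Python 3. Unlike the loop above, it copies the input sets so that the caller's data is left unchanged.

```python
import random
import time


def merge_overlapping(list_of_sets):
    """Repeat the merge pass until no pass merges anything."""
    current = [set(s) for s in list_of_sets]  # copies: inputs stay untouched
    while True:
        merged_one = False
        supersets = []
        for s in current:
            for ss in supersets:
                if s & ss:
                    ss |= s          # fold s into the first overlapping superset
                    merged_one = True
                    break
            else:
                supersets.append(s)  # no overlap found: s starts a new group
        if not merged_one:
            return supersets
        current = supersets


random.seed(0)  # assumption: fixed seed so repeated runs use the same data
data = [set(random.sample(range(5000), random.randrange(1, 20)))
        for _ in range(1000)]

start = time.perf_counter()
groups = merge_overlapping(data)
elapsed = time.perf_counter() - start
print("%d sets -> %d groups in %.3f s" % (len(data), len(groups), elapsed))
```

Each pass compares every set against the groups built so far, so the cost grows roughly with the number of sets times the number of groups; timings on real polygon data may look quite different.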
Answer 1 (score: 2)
Answer 2 (score: 1)
Pardon the messed-up capitalization (autocorrect...):
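import copy

# the results container
connected = set()
sets = []  # placeholder: some list of sets
# convert the sets to frozensets (which are hashable and can be added to sets themselves)
frozen_sets = list(map(frozenset, sets))
for s1 in frozen_sets:
    res = copy.copy(s1)
    for s2 in frozen_sets:
        if s1 & s2:
            res = res | s2
    connected.add(res)
# note: each set is only unioned with the sets that overlap it directly, so
# chains (A overlaps B, B overlaps C) may need the pass to be repeated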
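One thing to watch with this approach: a single pass unions each set only with the sets that overlap it directly, so on the question's sample data setA ends up as {0, 1, 2, 3, 4, 5} rather than the full 0 to 9 group. A possible way around that, sketched below with made-up function names, is to repeat the pass until the collection stops changing.

```python
def merge_direct_overlaps(sets):
    """One pass: union every set with all sets that overlap it directly."""
    frozen = [frozenset(s) for s in sets]
    connected = set()
    for s1 in frozen:
        res = s1
        for s2 in frozen:
            if s1 & s2:
                res = res | s2
        connected.add(res)  # duplicates collapse because frozensets are hashable
    return connected


def merge_until_stable(sets):
    """Repeat the one-pass merge until a pass changes nothing."""
    current = set(frozenset(s) for s in sets)
    while True:
        merged = merge_direct_overlaps(current)
        if merged == current:
            return merged
        current = merged


sample = [set([0, 1, 2]), set([6, 7, 8, 9]), set([4, 5, 6]), set([3, 4, 5, 0]),
          set([10, 11, 12]), set([11, 13, 14, 15]), set([16, 17, 18, 19])]
for group in merge_until_stable(sample):
    print(sorted(group))
```

Every pass compares all pairs of sets, so this is quadratic per pass and probably too slow for thousands of sets in real time; it is meant to show the fixed-point idea, not to be fast.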
Answer 3 (score: 0)
So.. I think I got it. It's a mess, but I got it. Here is what I did:
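def connected_valid(li):
    # True when no two different sets in the list share an element
    for i, l in enumerate(li):
        for j, k in enumerate(li):
            if i != j and contains(l, k):
                return False
    return True

def contains(set1, set2):
    # True when the two sets have at least one element in common
    for s in set1:
        if s in set2:
            return True
    return False

def combine(set1, set2):
    set2 |= set1
    return set2

def connect_sets(li):
    # Caveat: only the first set and its neighbour are compared while the
    # list rotates, so two overlapping sets that never become neighbours
    # (e.g. [A, B, C, D] where only A and C overlap) make this loop forever.
    while not connected_valid(li):
        s1 = li.pop(0)
        s2 = li[0]
        if contains(s1, s2):
            li[0] = combine(s1, s2)
        else:
            li.append(s1)
    return li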
Then in the main function you would do something like this:
setA = set([0,1,2])
setB = set([6,7,8,9])
setC = set([4,5,6])
setD = set([3,4,5,0])
setE = set([10,11,12])
setF = set([11,13,14,15])
setG = set([16,17,18,19])
connected_sets = connect_sets([setA,setB,setC,setD,setE,setF,setG,])
After running it, I get the following output:
print connected_sets
[set([0,1,2,3,4,5,6,7,8,9]), set([10,11,12,13,14,15]), set([16,17,18,19])]
Hope this is what you are looking for.
Edit: added code to randomly generate sets:
from random import sample

# Creates a list of 4000 sets, each holding a random number (0 to 19) of
# values drawn from 0 to 19999
sets = []
ma = 0
mi = 21000
for x in range(4000):
    rand_num = sample(range(20), 1)[0]
    tmp_set_li = sample(range(20000), rand_num)
    sets.append(set(tmp_set_li))
If you really want, the last 3 lines can be condensed into one.
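For reference, a condensed version might look like the sketch below. Two things here are assumptions rather than part of the answer: `randrange(20)` stands in for `sample(range(20), 1)[0]` (both pick a size from 0 to 19), and the seed is fixed only so that runs are repeatable.

```python
from random import randrange, sample, seed

seed(0)  # assumption: fixed seed, only to make the generated data repeatable

# 4000 sets, each with 0 to 19 distinct values drawn from 0 to 19999
sets = [set(sample(range(20000), randrange(20))) for _ in range(4000)]

print(len(sets))
```

Whether the one-liner is easier to read than the loop is a matter of taste; the loop makes the intermediate size and sample easier to inspect.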
Answer 4 (score: 0)
I tried to do something different: this algorithm loops once over each set and once over each element:
# Our test sets
setA = set([0,1,2])
setB = set([6,7,8,9])
setC = set([4,5,6])
setD = set([3,4,5,0])
setE = set([10,11,12])
setF = set([11,13,14,15])
setG = set([16,17,18,19])
list_of_sets = [setA,setB,setC,setD,setE,setF,setG]

# We will use a map to store our new merged sets.
# This map will work as a reference abstraction, so it will
# map set ids to the set or to other set id.
# This map may have an indirection level greater than 1
merged_sets = {}

# We will also use a map between indexes and set ids.
index_to_id = {}

# Given a set id, returns an equivalent set id that refers directly
# to a set in the merged_sets map
def resolve_id(id):
    if not isinstance(id, (int, long)):
        return None
    while isinstance(merged_sets[id], (int, long)):
        id = merged_sets[id]
    return id

# Points the informed set to the destination id
def link_id(id_source, id_destination):
    point_to = merged_sets[id_source]
    merged_sets[id_source] = id_destination
    if isinstance(point_to, (int, long)):
        link_id(point_to, id_destination)

empty_set_found = False
# For each set
for current_set_id, current_set in enumerate(list_of_sets):
    if len(current_set) == 0 and empty_set_found:
        continue
    if len(current_set) == 0:
        empty_set_found = True
    # Create a set id for the set and place it on the merged sets map
    merged_sets[current_set_id] = current_set
    # For each index in the current set
    possibly_merged_current_set = current_set
    for index in current_set:
        # See if the index is free, i.e., has not been assigned to any set id
        if index not in index_to_id:
            # If it is free, then assign the set id to the index
            index_to_id[index] = current_set_id
            # ... and then go to the next index
        else:
            # If it is not free, then we may need to merge the sets
            # Find out to which set we need to merge the current one,
            # ... dereferencing if necessary
            id_to_merge = resolve_id(index_to_id[index])
            # First we check to see if the assignment is to the current set or not
            if id_to_merge == resolve_id(merged_sets[current_set_id]):
                continue
            # Merge the current set to the one found
            print 'Merging %d with %d' % (current_set_id, id_to_merge)
            merged_sets[id_to_merge] |= possibly_merged_current_set
            possibly_merged_current_set = merged_sets[id_to_merge]
            # Map the current set id to the set id of the merged set
            link_id(current_set_id, id_to_merge)

# Return all the sets in the merged sets map (ignore the references)
print [x for x in merged_sets.itervalues() if not isinstance(x, (int, long))]
This prints:
Merging 2 with 1
Merging 3 with 0
Merging 3 with 1
Merging 5 with 4
[set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), set([10, 11, 12, 13, 14, 15]), set([16, 17, 18, 19])]
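The id indirection used here is close in spirit to a disjoint-set (union-find) structure. As a point of comparison, and not as a rewrite of the code above, here is a minimal union-find sketch that works on the elements themselves; the function name `connected_groups` is made up for illustration. One difference to be aware of: empty input sets contribute no elements, so they are dropped from the result.

```python
def connected_groups(list_of_sets):
    """Group elements that are linked through shared membership in any set."""
    parent = {}  # element -> parent element; a root is its own parent

    def find(x):
        # Walk up to the root, then point every node on the path at it.
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link every element of a set to the first element of that set.
    for s in list_of_sets:
        first = None
        for item in s:
            parent.setdefault(item, item)
            if first is None:
                first = item
            else:
                union(first, item)

    # Collect the elements under their roots.
    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


list_of_sets = [set([0, 1, 2]), set([6, 7, 8, 9]), set([4, 5, 6]),
                set([3, 4, 5, 0]), set([10, 11, 12]),
                set([11, 13, 14, 15]), set([16, 17, 18, 19])]
for group in connected_groups(list_of_sets):
    print(sorted(group))
```

Each element is visited once while linking and once while collecting, which is why this kind of structure tends to scale well; whether it is fast enough for real-time polygon data would still need measuring.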