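Given:
g=[[], [], [0, 2], [1, 5], [0, 2, 3, 7], [4, 6], [1, 4, 5, 6], [], [], [3, 7]]
How can I compare every list in g so that lists sharing any common number are merged into a single set?
For example, 0 appears in both g[2] and g[4], so they merge into the set {0,2,3,7}.
I tried the following, but it does not work: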
for i in g:
    for j in g:
        if k in i == l in j:
            m=set(i+j)
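I want to build the largest possible sets.
Answer 0 (score: 1)
Here is a quick one-liner that lists the union of every pair of lists that intersect: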
sets = [set(i+j) for i in g for j in g if i!=j and (set(i) & set(j))]
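Note that every result appears twice, because each pair of lists is compared twice: once with a given list on the left and once with it on the right.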
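As a minimal sketch of avoiding the doubled comparison (written for Python 3, not part of the original answer), itertools.combinations visits each unordered pair of lists once. Like the one-liner above, it only produces pairwise unions, so the same union can still show up when different pairs overlap, and chains of overlapping lists are not merged into one group.

```python
from itertools import combinations

g = [[], [], [0, 2], [1, 5], [0, 2, 3, 7], [4, 6], [1, 4, 5, 6], [], [], [3, 7]]

# combinations(g, 2) yields each unordered pair of lists exactly once,
# so no pair is compared a second time with its sides swapped.
pairs = [set(a) | set(b) for a, b in combinations(g, 2) if set(a) & set(b)]
print(pairs)
```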
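Answer 1 (score: 1)
A much faster way: first build a list of sets (s) from the items (only the non-empty ones, if you do not want empty sets in the result), then walk through the list and merge every intersecting pair with the union function: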
s=map(set,g)
def find_intersection(m_list):
    for i,v in enumerate(m_list) :
        for j,k in enumerate(m_list[i+1:],i+1):
            if v & k:
                m_list[i]=v.union(m_list.pop(j))
                return find_intersection(m_list)
    return m_list
Demo:
g=[[], [], [0, 2], [1, 5], [0, 2, 3, 7], [4, 6], [1, 4, 5, 6], [], [], [3, 7]]
s=map(set,g)
print find_intersection(s)
[set([0, 2, 3, 7]), set([1, 4, 5, 6])]
g=[[1,2,3],[3,4,5],[5,6],[6,7],[9,10],[10,11]]
s=map(set,g)
print find_intersection(s)
[set([1, 2, 3, 4, 5, 6, 7]), set([9, 10, 11])]
g=[[], [1], [0,2], [1, 5], [0, 2, 3, 7], [4, 6], [1, 4, 5, 6], [], [], [3, 7]]
s=map(set,g)
print find_intersection(s)
[set([1, 4, 5, 6]), set([0, 2, 3, 7])]
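Benchmark against @Mark's answer (as written, s2 only defines find_intersection and never calls it, so the second timing covers just the setup and the function definition):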
from timeit import timeit
s1="""g=[[], [], [0, 2], [1, 5], [0, 2, 3, 7], [4, 6], [1, 4, 5, 6], [], [], [3, 7]]
sets = [set(i+j) for i in g for j in g if i!=j and (set(i) & set(j))]
"""
s2="""g=[[], [], [0, 2], [1, 5], [0, 2, 3, 7], [4, 6], [1, 4, 5, 6], [], [], [3, 7]]
s=map(set,g)
def find_intersection(m_list):
    for i,v in enumerate(m_list) :
        for j,k in enumerate(m_list[i+1:],i+1):
            if v & k:
                m_list[i]=v.union(m_list.pop(j))
                return find_intersection(m_list)
    return m_list
"""
print ' first: ' ,timeit(stmt=s1, number=100000)
print 'second : ',timeit(stmt=s2, number=100000)
first: 3.8284008503
second : 0.213887929916
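The recursive function above starts over from the beginning of the list after every merge. A possible single-pass alternative, sketched here for Python 3 under the assumption that empty lists should simply be dropped, keeps a running list of disjoint groups and lets each new set absorb every group it overlaps. The name merge_intersecting is hypothetical and does not come from the answer.

```python
def merge_intersecting(lists):
    """Merge lists that share at least one element into disjoint sets."""
    merged = []  # invariant: the sets in here are pairwise disjoint
    for items in lists:
        current = set(items)
        if not current:
            continue  # an empty list cannot share an element with anything
        still_disjoint = []
        for group in merged:
            if group & current:
                current |= group  # absorb every group that overlaps
            else:
                still_disjoint.append(group)
        still_disjoint.append(current)
        merged = still_disjoint
    return merged


g = [[], [], [0, 2], [1, 5], [0, 2, 3, 7], [4, 6], [1, 4, 5, 6], [], [], [3, 7]]
print(merge_intersecting(g))
```

Because the existing groups never overlap each other, absorbing one group cannot create a new overlap with another, so one pass over the groups per input list is enough.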
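Answer 2 (score: 1)
If g or the elements of g are large, you can use Disjoint Sets to improve efficiency.
This data structure can be used to classify each element into the set it should belong to.
The first step is to build a Disjoint Sets collection in which every set of g is labeled with its index in g: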
g=[[], [], [0, 2], [1, 5], [0, 2, 3, 7], [4, 6], [1, 4, 5, 6], [], [], [3, 7],[99]]
g = map(set, g)
dss = CDisjointSets()
for i in xrange(len(g)):
    dss.MakeSet(i)
Then two sets are joined whenever their intersection is not empty:
for i in xrange(len(g)):
    for j in xrange(i+1, len(g)):
        if g[i].intersection(g[j]):
            dss.Join(i,j)
At this point dss gives you a common label for the sets of g that should be joined together:
print(dss)
parent(0) = 0
parent(1) = 1
parent(2) = 2
parent(3) = 3
parent(4) = 2
parent(5) = 3
parent(6) = 3
parent(7) = 7
parent(8) = 8
parent(9) = 2
parent(10) = 10
Now you only have to build the new sets by joining those that share the same label:
l2set = dict()
for i in xrange(len(g)):
    label = dss.FindLabel(i).getLabel()
    l2set[label] = l2set.get(label, set()).union(g[i])
print(l2set)
Which results in:
{0: set([]), 1: set([]), 2: set([0, 2, 3, 7]), 3: set([1, 4, 5, 6]), 7: set([]), 8: set([]), 10: set([99])}
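The grouping step can also be written with collections.defaultdict, which updates each set in place instead of building a new one with union on every iteration. This is only a sketch for Python 3: the labels list is copied by hand from the parent labels printed earlier rather than read from a CDisjointSets instance.

```python
from collections import defaultdict

g = [[], [], [0, 2], [1, 5], [0, 2, 3, 7], [4, 6], [1, 4, 5, 6], [], [], [3, 7], [99]]

# Assumption: one label per list in g, copied from the printed parents.
labels = [0, 1, 2, 3, 2, 3, 3, 7, 8, 2, 10]

l2set = defaultdict(set)
for label, items in zip(labels, g):
    l2set[label].update(items)  # extend the set for this label in place

print(dict(l2set))
```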
This is the Disjoint Sets implementation I used, but you can surely find another one with nicer syntax:
""" Disjoint Sets
-------------
Pablo Francisco Pérez Hidalgo
December,2012. """
class CDisjointSets:

    #Class to represent each set
    class DSet:
        def __init__(self, label_value):
            self.__label = label_value
            self.rank = 1
            self.parent = self
        def getLabel(self):
            return self.__label

    #CDisjointSets Private attributes
    __sets = None

    #CDisjointSets Constructors and public methods.
    def __init__(self):
        self.__sets = {}

    def MakeSet(self, label):
        if label in self.__sets: #This check slows the operation a lot,
            return False         #it should be removed if it is sure that
                                 #two sets with the same label are not going
                                 #to be created.
        self.__sets[label] = self.DSet(label)

    #Pre: 'labelA' and 'labelB' are labels of existing disjoint sets.
    def Join(self, labelA, labelB):
        a = self.__sets[labelA]
        b = self.__sets[labelB]
        pa = self.Find(a)
        pb = self.Find(b)
        if pa == pb:
            return #They are already joined
        parent = pa
        child = pb
        if pa.rank < pb.rank:
            parent = pb
            child = pa
        child.parent = parent
        parent.rank = max(parent.rank, child.rank+1)

    def Find(self,x):
        if x == x.parent:
            return x
        x.parent = self.Find(x.parent)
        return x.parent

    def FindLabel(self, label):
        return self.Find(self.__sets[label])

    def __str__(self):
        ret = ""
        for e in self.__sets:
            ret = ret + "parent("+self.__sets[e].getLabel().__str__()+") = "+self.FindLabel(e).parent.getLabel().__str__() + "\n"
        return ret
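For comparison, here is a more compact union-find sketch for Python 3 that labels the individual numbers rather than the indices of g, which avoids comparing every pair of sets. It is an illustration under stated assumptions, not the answer's implementation: merge_by_union_find, find and union are hypothetical names, there is no union by rank, and empty lists produce no group at all.

```python
def merge_by_union_find(lists):
    """Group numbers that are connected through shared lists."""
    parent = {}  # maps each element to its parent; roots map to themselves

    def find(x):
        # Walk up to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for items in lists:
        for item in items:
            parent.setdefault(item, item)
        # Linking every element to the first one joins the whole list.
        for item in items[1:]:
            union(items[0], item)

    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


g = [[], [], [0, 2], [1, 5], [0, 2, 3, 7], [4, 6], [1, 4, 5, 6], [], [], [3, 7], [99]]
print(merge_by_union_find(g))
```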