I want to group tuples based on similar attributes

Asked: 2012-12-10 20:34:55

Tags: python list tuples

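I have a list of tuples: [(1,2),(2,3),(4,3),(5,6),(6,7),(8,2)]

I want to group them into lists according to which tuples are connected (that is, share a value).

So the end result is two lists of related tuple values: [[1,2,3,4,8],[5,6,7]]

How do I write a function to do this? This was a job interview question. I tried to do it in Python, but I got frustrated and just want to see the logic behind the answer, so even pseudocode would help me see what I did wrong.

I only had a few minutes to do this on the spot, but here is what I tried: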

def find_partitions(connections):
 theBigList = []     # List of Lists
 list1 = []          # The initial list to store lists
 theBigList.append(list1)

 for list in theBigList:
 list.append(connection[1[0], 1[1]])
     for i in connections:
         if i[0] in list or i[1] in list:
             list.append(i[0], i[1])

         else:
             newList = []
             theBigList.append(newList)

Basically, the interviewer wanted a list of lists of related values. I tried using a for loop, but realized it wouldn't work, and then time ran out.

5 Answers:

Answer 0 (score: 2)

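As we fill in the components, there are three cases to consider at each stage (because overlapping groups have to be merged):

  1. Neither x nor y is in any component found so far.
  2. Both are already in different sets: x in set_i, y in set_j.
  3. One or both are in a single component: x in set_i or y in set_i.

We can use the built-in set to help. (See @jwpat's and @DSM's trickier examples.)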

    def connected_components(lst):
        components = [] # list of sets
        for (x,y) in lst:
            i = j = set_i = set_j = None
            for k, c in enumerate(components):
                if x in c:
                    i, set_i = k, c
                if y in c:
                    j, set_j = k, c
    
            # case 1: neither value seen yet (or both already in the same set)
            if i == j:
                if i is None:
                    components.append(set([x, y]))
                continue
    
            # case 2: x and y are in different sets, so merge them
            if i is not None and j is not None:
                components = [components[k] for k in range(len(components)) if k != i and k != j]
                components.append(set_i | set_j)
                continue
    
            # case 3: exactly one value is already in a set
            if j is not None:
                components[j].add(x)
            if i is not None:
                components[i].add(y)
    
        return components
    
    lst = [(1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2)]
    connected_components(lst)
    # [set([8, 1, 2, 3, 4]), set([5, 6, 7])]
    list(map(list, connected_components(lst)))
    # [[8, 1, 2, 3, 4], [5, 6, 7]]
    
    connected_components([(1, 2), (4, 3), (2, 3), (5, 6), (6, 7), (8, 2)])
    # [set([8, 1, 2, 3, 4]), set([5, 6, 7])] # @jwpat's example
    
    connected_components([[1, 3], [2, 4], [3, 4]])
    # [set([1, 2, 3, 4])] # @DSM's example
    

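This is certainly not the most efficient approach, but it is probably similar to what they would expect. As Jon Clements points out, there is a library for this kind of computation, networkx, which will be more efficient.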
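As a rough illustration of one more efficient approach than the nested scan above, here is a minimal union-find (disjoint-set) sketch. It is a swapped-in technique, not the code from this answer and not necessarily what networkx does internally. The name `connected_components_uf` is hypothetical.

```python
def connected_components_uf(pairs):
    """Group values linked by pairs using a union-find (disjoint-set) forest."""
    parent = {}  # value -> parent value; a root is its own parent

    def find(v):
        # Register unseen values as their own root.
        parent.setdefault(v, v)
        # Walk up to the root, halving the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for x, y in pairs:
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_y] = root_x  # merge the two groups

    # Collect every value under its root.
    groups = {}
    for v in parent:
        groups.setdefault(find(v), set()).add(v)
    return sorted(sorted(g) for g in groups.values())


print(connected_components_uf([(1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2)]))
# [[1, 2, 3, 4, 8], [5, 6, 7]]
```

Each pair costs a couple of near-constant-time `find` calls, so the list of components is never rescanned for every tuple.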

Answer 1 (score: 1)

l = [ (1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2) ]

# map each value to the corresponding connected component
d = {}
for i, j in l:
  di = d.setdefault(i, {i})
  dj = d.setdefault(j, {j})
  if di is not dj:
    di |= dj
    for k in dj:
      d[k] = di

# print out the connected components
p = set()
for i in d.keys():
  if i not in p:
    print(d[i])
  p |= d[i]
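For readers who want the result in the shape the question asks for, the same dictionary-of-shared-sets idea can be wrapped in a function, sketched below. The function name `group_values` and the sorting of the output are assumptions added for illustration, not part of the answer.

```python
def group_values(pairs):
    """Return sorted lists of values that are linked through the pairs."""
    component = {}  # value -> the set object (shared by the group) holding it
    for a, b in pairs:
        set_a = component.setdefault(a, {a})
        set_b = component.setdefault(b, {b})
        if set_a is not set_b:
            set_a |= set_b                 # absorb b's group into a's group
            for value in set_b:
                component[value] = set_a   # repoint b's members to the merged set
    # Many keys share one set object, so deduplicate by identity.
    unique = {id(s): s for s in component.values()}
    return sorted(sorted(s) for s in unique.values())


print(group_values([(1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2)]))
# [[1, 2, 3, 4, 8], [5, 6, 7]]
```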

Answer 2 (score: 0)

This certainly isn't elegant, but it works:

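def _grouper(s,ll):
    # Move every tuple that touches s out of ll and into s.
    for tup in ll[:]:
        if any(x in s for x in tup):
            for y in tup:
                s.add(y)
            ll.remove(tup)  # remove once, after both values have been added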

def grouper(ll,out=None):
    _ll = ll[:]
    s = set(_ll.pop(0))  # pop from the copy so the caller's list is not modified
    if out is None:
        out = [s]
    else:
        out.append(s)

    l_old = 0
    while l_old != len(_ll):
        l_old = len(_ll)
        _grouper(s,_ll)

    if _ll:
        return grouper(_ll,out=out)
    else:
        return out

ll = [ (1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2) ]
print(grouper(ll))

Answer 3 (score: 0)

Using sets:

In [235]: def func(ls):
    new_lis=sorted(sorted(ls),key=min) 
    lis=[set(new_lis[0])]
    for x in new_lis[1:]:
            for y in lis:
                    if not set(x).isdisjoint(y):
                            y.update(x);break 
            else:lis.append(set(x))
    return lis
   .....: 

In [236]: func([(3, 1), (9, 3), (6, 9)])
Out[236]: [set([1, 3, 6, 9])]

In [237]: func([[2,1],[3,0],[1,3]])
Out[237]: [set([0, 1, 2, 3])]

In [239]: func([(1, 2), (4, 3), (2, 3), (5, 6), (6, 7), (8, 2)])
Out[239]: [set([8, 1, 2, 3, 4]), set([5, 6, 7])]
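One thing to check in the loop above: it stops at the first overlapping set, so two groups that are created separately and later bridged by a single tuple do not get merged with each other. Traced by hand, `func([(1, 5), (2, 6), (5, 6)])` appears to return two groups that both contain 6. The variant below is a hedged sketch that folds every overlapping group into the new one. The name `func_merging` is hypothetical and this is not the answerer's code.

```python
def func_merging(ls):
    """Like func above, but merges every existing group that a tuple touches."""
    groups = []
    for pair in ls:
        merged = set(pair)
        untouched = []
        for group in groups:
            if merged.isdisjoint(group):
                untouched.append(group)
            else:
                merged |= group  # fold the overlapping group into the new one
        groups = untouched + [merged]
    return groups


print([sorted(g) for g in func_merging([(1, 5), (2, 6), (5, 6)])])
# [[1, 2, 5, 6]]
```

No sorting of the input is needed here, because every group that overlaps the current tuple is merged regardless of the order in which the tuples arrive.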

Answer 4 (score: 0)

How about:

ts = [(1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2)]
ss = []
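while len(ts) > 0:
    s = set(ts.pop())
    ol = 0
    nl = len(s)
    # Keep sweeping until a full pass adds no new values to s.
    while ol < nl:
        # Iterate over a copy: removing from ts while looping over it skips items.
        for t in ts[:]:
            if t[0] in s or t[1] in s:
                s = s.union(t)
                ts.remove(t)
        ol = nl
        nl = len(s)
    ss.append(s)

print(ss)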