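Chu-Liu/Edmonds algorithm for minimum spanning trees on directed graphs

Time: 2014-06-02 06:08:06

Tags: python algorithm graph tree

I would like to find a minimum spanning tree (MST) on a weighted directed graph. I have been trying to use the Chu-Liu/Edmonds algorithm, which I have implemented in Python (code below). A simple, clear description of the algorithm can be found here. I have two questions.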

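  1. Is Edmonds' algorithm guaranteed to converge on a solution?

    I am concerned that removing one cycle will add another. If that happens, the algorithm will keep trying to remove cycles forever.

    I seem to have found an example where this happens. The input graph is shown below (in the code). The algorithm never finishes because it switches between cycles [1,2] and [1,3], and between cycles [5,4] and [5,6]. The edge added to the graph to resolve cycle [5,4] creates cycle [5,6] and vice versa, and similarly for [1,2] and [1,3].

    I should note that I am not certain my implementation is correct.

  2. To resolve this issue, I introduced an ad hoc patch. When an edge is removed to break a cycle, I permanently remove that edge from the underlying graph G on which we are searching for an MST. That edge therefore cannot be added again, which should prevent the algorithm from getting stuck. With this change, am I guaranteed to find an MST?

    I suspect that one can find a pathological case where this step leads to a result that is not an MST, but I have not been able to think of one. It seems to work on all the simple test cases I have tried.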

Code:

    import sys
    
    # --------------------------------------------------------------------------------- #
    
    def _reverse(graph):
        r = {}
        for src in graph:
            for (dst,c) in graph[src].items():
                if dst in r:
                    r[dst][src] = c
                else:
                    r[dst] = { src : c }
        return r
    
    # Finds all cycles in graph using Tarjan's algorithm
    def strongly_connected_components(graph):
        """
        Tarjan's Algorithm (named for its discoverer, Robert Tarjan) is a graph theory algorithm
        for finding the strongly connected components of a graph.
    
        Based on: http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm
        """
    
        index_counter = [0]
        stack = []
        lowlinks = {}
        index = {}
        result = []
    
        def strongconnect(node):
            # set the depth index for this node to the smallest unused index
            index[node] = index_counter[0]
            lowlinks[node] = index_counter[0]
            index_counter[0] += 1
            stack.append(node)
    
            # Consider successors of `node`
            try:
                successors = graph[node]
            except:
                successors = []
            for successor in successors:
                if successor not in lowlinks:
                    # Successor has not yet been visited; recurse on it
                    strongconnect(successor)
                    lowlinks[node] = min(lowlinks[node],lowlinks[successor])
                elif successor in stack:
                    # the successor is in the stack and hence in the current strongly connected component (SCC)
                    lowlinks[node] = min(lowlinks[node],index[successor])
    
            # If `node` is a root node, pop the stack and generate an SCC
            if lowlinks[node] == index[node]:
                connected_component = []
    
                while True:
                    successor = stack.pop()
                    connected_component.append(successor)
                    if successor == node: break
                component = tuple(connected_component)
                # storing the result
                result.append(component)
    
        for node in graph:
            if node not in lowlinks:
                strongconnect(node)
    
        return result
    
    def _mergeCycles(cycle,G,RG,g,rg):
        allInEdges = [] # all edges entering cycle from outside cycle
        minInternal = None
        minInternalWeight = sys.maxint
    
        # Find minimal internal edge weight
        for n in cycle:
            for e in RG[n]:
                if e in cycle:
                    if minInternal is None or RG[n][e] < minInternalWeight:
                        minInternal = (n,e)
                        minInternalWeight = RG[n][e]
                        continue
                else:
                    allInEdges.append((n,e)) # edge enters cycle
    
        # Find the incoming edge with minimum modified cost
        # modified cost c(i,k) = c(i,j) - (c(x_j, j) - min{j}(c(x_j, j)))
        minExternal = None
        minModifiedWeight = 0
        for j,i in allInEdges: # j is vertex in cycle, i is candidate vertex outside cycle
            xj, weight_xj_j = rg[j].popitem() # xj is vertex in cycle that currently goes to j
            rg[j][xj] = weight_xj_j # put item back in dictionary
            w = RG[j][i] - (weight_xj_j - minInternalWeight) # c(i,k) = c(i,j) - (c(x_j, j) - min{j}(c(x_j, j)))
            if minExternal is None or w <= minModifiedWeight:
                minExternal = (j,i)
                minModifiedWeight = w
    
        w = RG[minExternal[0]][minExternal[1]] # weight of edge entering cycle
        xj,_ = rg[minExternal[0]].popitem() # xj is vertex in cycle that currently goes to j
        rem = (minExternal[0], xj) # edge to remove
        rg[minExternal[0]].clear() # popitem() should delete the one edge into j, but we ensure that
    
        # Remove offending edge from RG
        # RG[minExternal[0]].pop(xj, None) #highly experimental. throw away the offending edge, so we never get it again
    
        if rem[1] in g:
            if rem[0] in g[rem[1]]:
                del g[rem[1]][rem[0]]
        if minExternal[1] in g:
            g[minExternal[1]][minExternal[0]] = w
        else:
            g[minExternal[1]] = { minExternal[0] : w }
    
        rg = _reverse(g)
    
    # --------------------------------------------------------------------------------- #
    
    def mst(root,G):
        """ The Chu-Liu/Edmond's algorithm
    
        arguments:
    
        root - the root of the MST
        G - the graph in which the MST lies
    
        returns: a graph representation of the MST
    
        Graph representation is the same as the one found at:
        http://code.activestate.com/recipes/119466/
    
        Explanation is copied verbatim here:
    
        The input graph G is assumed to have the following
        representation: A vertex can be any object that can
        be used as an index into a dictionary.  G is a
        dictionary, indexed by vertices.  For any vertex v,
        G[v] is itself a dictionary, indexed by the neighbors
        of v.  For any edge v->w, G[v][w] is the length of
        the edge.
        """
    
        RG = _reverse(G)
    
        g = {}
        for n in RG:
            if len(RG[n]) == 0:
                continue
            minimum = sys.maxint
            s,d = None,None
    
            for e in RG[n]:
                if RG[n][e] < minimum:
                    minimum = RG[n][e]
                    s,d = n,e
    
            if d in g:
                g[d][s] = RG[s][d]
            else:
                g[d] = { s : RG[s][d] }
    
        cycles = [list(c) for c in strongly_connected_components(g)]
    
        cycles_exist = True
        while cycles_exist:
    
            cycles_exist = False
            cycles = [list(c) for c in strongly_connected_components(g)]
            rg = _reverse(g)
    
            for cycle in cycles:
    
                if root in cycle:
                    continue
    
                if len(cycle) == 1:
                    continue
    
                _mergeCycles(cycle, G, RG, g, rg)
                cycles_exist = True
    
        return g
    
    # --------------------------------------------------------------------------------- #
    
    if __name__ == "__main__":
    
        # an example of an input that works
        root = 0
        g = {0: {1: 23, 2: 22, 3: 22}, 1: {2: 1, 3: 1}, 3: {1: 1, 2: 0}}
    
        # an example of an input that causes infinite cycle
        root = 0
        g = {0: {1: 17, 2: 16, 3: 19, 4: 16, 5: 16, 6: 18}, 1: {2: 3, 3: 3, 4: 11, 5: 10, 6: 12}, 2: {1: 3, 3: 4, 4: 8, 5: 8, 6: 11}, 3: {1: 3, 2: 4, 4: 12, 5: 11, 6: 14}, 4: {1: 11, 2: 8, 3: 12, 5: 6, 6: 10}, 5: {1: 10, 2: 8, 3: 11, 4: 6, 6: 4}, 6: {1: 12, 2: 11, 3: 14, 4: 10, 5: 4}}
    
        h = mst(int(root),g)
    
        print h
    
        for s in h:
            for t in h[s]:
                print "%d-%d" % (s,t)
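
One way to probe the patch for pathological cases is to compare its result against an exhaustive search on small graphs. The following is a minimal sketch under stated assumptions, not part of the implementation above: it is written for Python 3 (the code above is Python 2), it assumes the same `G[v][w]` dictionary representation, and the names `brute_force_min_weight` and `reaches_root` are hypothetical. It tries every way of picking one incoming edge per non-root vertex and keeps the cheapest choice in which every vertex can trace its parents back to the root.

```python
from itertools import product


def reaches_root(v, parent, root):
    """Follow parent pointers from v; return False if a cycle is hit first."""
    seen = set()
    while v != root:
        if v in seen:
            return False
        seen.add(v)
        v = parent[v]
    return True


def brute_force_min_weight(root, G):
    """Minimum total weight of a spanning arborescence rooted at `root`.

    Returns None if no spanning arborescence exists. Exponential time:
    only intended as a cross-check on very small graphs.
    """
    # Collect every vertex that appears as a source or a destination.
    nodes = set(G)
    for u in G:
        nodes.update(G[u])

    # incoming[v] lists the (parent, weight) candidates for each non-root v.
    incoming = {v: [] for v in nodes if v != root}
    for u in G:
        for v, w in G[u].items():
            if v != root:
                incoming[v].append((u, w))

    order = list(incoming)
    best = None
    # One incoming edge per non-root vertex; product() yields nothing
    # if some vertex has no incoming edge at all.
    for choice in product(*(incoming[v] for v in order)):
        parent = {v: u for v, (u, _) in zip(order, choice)}
        if all(reaches_root(v, parent, root) for v in order):
            total = sum(w for _, w in choice)
            if best is None or total < best:
                best = total
    return best
```

Summing the edge weights of the dictionary returned by `mst(root, G)` and comparing that sum with `brute_force_min_weight(root, G)` on randomly generated small graphs would expose any input on which the patched algorithm returns something heavier than the optimum.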
    

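2 Answers:

Answer 0 (score: 7)

Don't do ad hoc patches. I concede that implementing the contraction/uncontraction logic is not intuitive, and that recursion is undesirable in some contexts, so here is a proper Python implementation that could be made production quality. Rather than performing the uncontraction step at each recursive level, we defer it to the end and use depth-first search, thereby avoiding recursion. (The correctness of this modification follows ultimately from complementary slackness, part of the theory of linear programming.)

The naming convention below is that _rep signifies a supernode (i.e., a block of one or more contracted nodes).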

    #!/usr/bin/env python3
    from collections import defaultdict, namedtuple


    Arc = namedtuple('Arc', ('tail', 'weight', 'head'))


    def min_spanning_arborescence(arcs, sink):
        good_arcs = []
        quotient_map = {arc.tail: arc.tail for arc in arcs}
        quotient_map[sink] = sink
        while True:
            min_arc_by_tail_rep = {}
            successor_rep = {}
            for arc in arcs:
                if arc.tail == sink:
                    continue
                tail_rep = quotient_map[arc.tail]
                head_rep = quotient_map[arc.head]
                if tail_rep == head_rep:
                    continue
                if tail_rep not in min_arc_by_tail_rep or min_arc_by_tail_rep[tail_rep].weight > arc.weight:
                    min_arc_by_tail_rep[tail_rep] = arc
                    successor_rep[tail_rep] = head_rep
            cycle_reps = find_cycle(successor_rep, sink)
            if cycle_reps is None:
                good_arcs.extend(min_arc_by_tail_rep.values())
                return spanning_arborescence(good_arcs, sink)
            good_arcs.extend(min_arc_by_tail_rep[cycle_rep] for cycle_rep in cycle_reps)
            cycle_rep_set = set(cycle_reps)
            cycle_rep = cycle_rep_set.pop()
            quotient_map = {node: cycle_rep if node_rep in cycle_rep_set else node_rep for node, node_rep in quotient_map.items()}


    def find_cycle(successor, sink):
        visited = {sink}
        for node in successor:
            cycle = []
            while node not in visited:
                visited.add(node)
                cycle.append(node)
                node = successor[node]
            if node in cycle:
                return cycle[cycle.index(node):]
        return None


    def spanning_arborescence(arcs, sink):
        arcs_by_head = defaultdict(list)
        for arc in arcs:
            if arc.tail == sink:
                continue
            arcs_by_head[arc.head].append(arc)
        solution_arc_by_tail = {}
        stack = arcs_by_head[sink]
        while stack:
            arc = stack.pop()
            if arc.tail in solution_arc_by_tail:
                continue
            solution_arc_by_tail[arc.tail] = arc
            stack.extend(arcs_by_head[arc.tail])
        return solution_arc_by_tail


    print(min_spanning_arborescence([Arc(1, 17, 0), Arc(2, 16, 0), Arc(3, 19, 0), Arc(4, 16, 0), Arc(5, 16, 0), Arc(6, 18, 0), Arc(2, 3, 1), Arc(3, 3, 1), Arc(4, 11, 1), Arc(5, 10, 1), Arc(6, 12, 1), Arc(1, 3, 2), Arc(3, 4, 2), Arc(4, 8, 2), Arc(5, 8, 2), Arc(6, 11, 2), Arc(1, 3, 3), Arc(2, 4, 3), Arc(4, 12, 3), Arc(5, 11, 3), Arc(6, 14, 3), Arc(1, 11, 4), Arc(2, 8, 4), Arc(3, 12, 4), Arc(5, 6, 4), Arc(6, 10, 4), Arc(1, 10, 5), Arc(2, 8, 5), Arc(3, 11, 5), Arc(4, 6, 5), Arc(6, 4, 5), Arc(1, 12, 6), Arc(2, 11, 6), Arc(3, 14, 6), Arc(4, 10, 6), Arc(5, 4, 6)], 0))
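
Note that the arcs in the example call point toward the sink: the question's entry `G[0][1] = 17` (edge 0 -> 1) appears above as `Arc(1, 17, 0)`. A minimal sketch of converting between the two representations might look like the following. The helper names `graph_to_arcs` and `solution_to_graph` are hypothetical and not part of the implementation above; the `Arc` definition is repeated only so that the snippet stands alone.

```python
from collections import namedtuple

# Same field order as the Arc used by min_spanning_arborescence.
Arc = namedtuple('Arc', ('tail', 'weight', 'head'))


def graph_to_arcs(G):
    """Convert G[u][v] = weight (edge u -> v) into arcs pointing toward the root.

    Each edge u -> v becomes an arc with tail v and head u, matching the
    orientation of the example call (G[0][1] = 17 becomes Arc(1, 17, 0)).
    """
    return [Arc(v, w, u) for u in G for v, w in G[u].items()]


def solution_to_graph(solution_arc_by_tail):
    """Convert a {tail: Arc} mapping back into the G[u][v] = weight form."""
    tree = {}
    for arc in solution_arc_by_tail.values():
        # The arc's head is the parent, so it becomes the source vertex again.
        tree.setdefault(arc.head, {})[arc.tail] = arc.weight
    return tree
```

With these helpers, `solution_to_graph(min_spanning_arborescence(graph_to_arcs(G), root))` would return the tree in the same dictionary form that the question's `mst(root, G)` produces.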

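Answer 1 (score: 0)

This does not directly answer your question, but the following recursive implementation of Edmonds' algorithm seems to work as expected:

graph.py

Note that the mst() method in this implementation returns a maximum spanning tree. Hopefully this code can serve as a reference for adjusting your implementation.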
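
A common way to get a minimum spanning tree out of a routine that maximizes is to negate every edge weight before the call and negate the weights of the result afterwards. Every spanning tree's total weight simply changes sign, so the maximum under the negated weights is the minimum under the original ones. The sketch below assumes the dictionary-of-dictionaries representation from the question; the helper name `negate_weights` is hypothetical, and whether graph.py accepts this representation or handles negative weights is an assumption that should be checked.

```python
def negate_weights(G):
    """Return a copy of G[u][v] = weight with every weight negated."""
    return {u: {v: -w for v, w in G[u].items()} for u in G}
```

Applying `negate_weights` to the tree that comes back restores the original weights, since negating twice is the identity.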