Question

Kahn proposed an algorithm in 1962 to topologically sort any DAG (directed acyclic graph). Here is the pseudocode, copied from Wikipedia:

L ← Empty list that will contain the sorted elements 
S ← Set of all nodes with no incoming edges 
while S is non-empty do
    remove a node n from S
    add n to tail of L
    for each node m with an edge e from n to m do
        remove edge e from the graph  # This is a DESTRUCTIVE step!
        if m has no other incoming edges then
            insert m into S
if graph has edges then
    return error (graph has at least one cycle)
else
    return L (a topologically sorted order)

I need to implement it in IPython3, using the following implementation of a DAG:

class Node(object):
    def __init__(self, name, parents):
        assert isinstance(name, str)
        assert all(isinstance(_, Node) for _ in parents)  # parents must be Node instances
        self.name, self.parents = name, parents

where name is the label of the node and parents stores all of its parent nodes. The DAG class is then implemented as:

class DAG(object):
    def __init__(self, *nodes):
        assert all(isinstance(_, Node) for _ in nodes)
        self.nodes = nodes

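(The DAG implementation is fixed and is not to be improved.) I then need to implement Kahn's algorithm as a function top_order that takes a DAG instance and returns an ordering like (node_1, node_2, ..., node_n). The main trouble is that the algorithm is destructive: one of its steps is "remove edge e from the graph" (the step marked DESTRUCTIVE in the pseudocode), which deletes one member of m.parents. However, I have to leave the DAG instance intact.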

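The one approach I can think of so far is to create a deep copy of the DAG instance passed in (a shallow copy won't do, because the algorithm would still destroy the original instance through shared references), run the destructive algorithm on that copy, and collect the correct ordering of the copy's node names (assuming no two nodes share a name). I then use this ordering of names to recover the ordering of the nodes in the original instance, roughly as follows: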
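The difference between the two kinds of copy can be checked directly. The following is a minimal sketch, assuming simplified Node and DAG classes (the asserts are dropped for brevity) and a two-node graph invented for illustration:

```python
import copy


class Node(object):
    def __init__(self, name, parents):
        self.name, self.parents = name, parents


class DAG(object):
    def __init__(self, *nodes):
        self.nodes = nodes


# Hypothetical two-node graph: a -> b
a = Node("a", ())
b = Node("b", (a,))
dag = DAG(a, b)

shallow = copy.copy(dag)
deep = copy.deepcopy(dag)

# The shallow copy is a new DAG object that still holds the same Node objects,
# so editing a node's parents through it would also edit the original.
shallow_shares_nodes = shallow.nodes[1] is dag.nodes[1]

# The deep copy holds brand-new Node objects.
deep_shares_nodes = deep.nodes[1] is dag.nodes[1]

# Removing an edge through the deep copy leaves the original untouched.
deep.nodes[1].parents = ()
original_parent_count = len(b.parents)
```

Here shallow_shares_nodes is True and deep_shares_nodes is False, which is why only the deep copy protects the original from the edge removal.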

import copy

def top_order(network):
    '''takes in a DAG, prints and returns a topological ordering.'''
    assert type(network) == DAG
    temp = copy.deepcopy(network)  # to leave the original instance intact

    ordering_name = []
    roots = [node for node in temp.nodes if not node.parents]
    while roots:
        n_node = roots[0]
        del roots[0]
        ordering_name.append(n_node.name)
        for m_node in temp.nodes:
            if n_node in m_node.parents:
                temp_list = list(m_node.parents)
                temp_list.remove(n_node)
                m_node.parents = tuple(temp_list)
                if not m_node.parents:
                    roots.append(m_node)

    print(ordering_name)  # print ordering by name

    # gets ordering of nodes of the original instance
    ordering = []
    for name in ordering_name:
        for node in network.nodes:
            if node.name == name:
                ordering.append(node)

    return tuple(ordering)

Two problems: first, when network is huge, the deep copy will be expensive; second, I would like to improve the nested for loops that recover the ordering of the original instance. (For the second one, I have a feeling that some built-in method could help.)
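For the second concern, one possible option is a dict keyed by node name, which replaces the inner scan over all nodes with a single lookup. This is only a sketch: reorder_by_name is a hypothetical helper name, the Node class is simplified, and node names are assumed to be unique.

```python
class Node(object):
    def __init__(self, name, parents):
        self.name, self.parents = name, parents


def reorder_by_name(nodes, ordering_name):
    """Return the nodes arranged in the order given by ordering_name.

    Hypothetical helper; assumes every name is unique and present in nodes.
    """
    by_name = {node.name: node for node in nodes}  # one pass to build the index
    return tuple(by_name[name] for name in ordering_name)  # one lookup per name


# Example with three made-up nodes stored out of order.
a = Node("a", ())
b = Node("b", (a,))
c = Node("c", (b,))
ordered = reorder_by_name((c, a, b), ["a", "b", "c"])
```

With n nodes this does roughly n dict insertions and n lookups instead of n * n name comparisons.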

Any suggestions?

Answer 1

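I'm going to suggest a less literal implementation of the algorithm: you don't need to manipulate the DAG at all, you only need to manipulate information about the DAG. The only "interesting" things the algorithm needs are a mapping from each node to its children (the opposite of what your DAG actually stores) and a count of each node's parents.

These are easy to compute, and dicts can be used to associate this information with node names (assuming all names are distinct; if not, you can invent unique names with a bit more code).

Then this should work: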

def topsort(dag):
    name2node = {node.name: node for node in dag.nodes}
    # map name to number of predecessors (parents)
    name2npreds = {}
    # map name to list of successors (children)
    name2succs = {name: [] for name in name2node}

    for node in dag.nodes:
        thisname = node.name
        name2npreds[thisname] = len(node.parents)
        for p in node.parents:
            name2succs[p.name].append(thisname)

    result = [n for n, npreds in name2npreds.items() if npreds == 0]
    for p in result:
        for c in name2succs[p]:
            npreds = name2npreds[c]
            assert npreds
            npreds -= 1
            name2npreds[c] = npreds
            if npreds == 0:
                result.append(c)

    if len(result) < len(name2node):
        raise ValueError("no topsort - cycle")
    return tuple(name2node[p] for p in result)

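There's one subtle point here: the outer loop appends to result while it is iterating over result. That's intentional. The effect is that every element of result is processed exactly once by the outer loop, regardless of whether it was in the initial result or was added later.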
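The append-while-iterating behaviour can be seen in isolation. In this small sketch, the names queue and visited and the numbers are made up purely for illustration:

```python
queue = [1]      # plays the role of `result`: starts with the initial items
visited = []     # records the order in which the loop body runs

for item in queue:
    visited.append(item)
    if item < 4:
        # Appending during iteration: the for loop will reach this item later.
        queue.append(item + 1)

# visited is now [1, 2, 3, 4]: each element was handled exactly once,
# including the ones appended while the loop was already running.
```

This works because a for loop over a list advances by index, so items appended to the end are picked up before the loop finishes.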

Note that while the input DAG and its Nodes are traversed, nothing in them is altered.
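One way to sanity-check both claims for any ordering function is to test that every node comes after its parents and to compare a snapshot of the parents taken before and after the call. This is a sketch only: is_valid_order and snapshot are hypothetical helper names, the Node and DAG classes are simplified, and the diamond-shaped graph is invented for the example.

```python
class Node(object):
    def __init__(self, name, parents):
        self.name, self.parents = name, parents


class DAG(object):
    def __init__(self, *nodes):
        self.nodes = nodes


def is_valid_order(dag, order):
    """Return True if order lists every node of dag once, parents first."""
    if len(order) != len(dag.nodes) or set(order) != set(dag.nodes):
        return False
    position = {node: i for i, node in enumerate(order)}
    return all(position[p] < position[node]
               for node in dag.nodes for p in node.parents)


def snapshot(dag):
    """Record each node's name and parents so a later comparison detects mutation."""
    return [(node.name, node.parents) for node in dag.nodes]


# Hypothetical diamond: a -> b, a -> c, b -> d, c -> d
a = Node("a", ())
b = Node("b", (a,))
c = Node("c", (a,))
d = Node("d", (b, c))
dag = DAG(d, c, b, a)

before = snapshot(dag)
good = is_valid_order(dag, (a, c, b, d))   # parents always precede children
bad = is_valid_order(dag, (a, d, b, c))    # d appears before its parents
unchanged = snapshot(dag) == before        # checking did not mutate the DAG
```

Passing the tuple returned by an ordering function to is_valid_order, and comparing snapshots taken around the call, would exercise the same two properties on real input.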
