Question

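I am trying to get all the connected components in a graph and print them out. I go through each node of the graph and perform a depth-first search (DFS) starting from that node. Here is my code: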

graph = {
'a': ['b'],
'b': ['c'],
'c': ['d'],
'd': [],
'e': ['f'],
'f': []
}

def DFS(graph, start_node, stack = [], visited = []):
    stack.append(start_node)

    while stack:
        v = stack.pop()
        if v not in visited:
            visited.append(v)
            for neighbor in graph[v]:
                stack.append(neighbor)
    return visited



def dfs_get_connected_components_util(graph):
    visited = []

    for node in graph:
        if node not in visited:
            DFS_algo = DFS(graph, node)
            print(DFS_algo)
            visited = DFS_algo

print(dfs_get_connected_components_util(graph))

According to my graph, there are two connected components: a -> b -> c -> d and e -> f

Instead, I get the following printout:

['c', 'd']
['c', 'd', 'a', 'b']
['c', 'd', 'a', 'b', 'f']
['c', 'd', 'a', 'b', 'f', 'e']

I can't seem to figure out what I'm doing wrong in my connected-components function. I guess this may be more of a Python question.

Answer 1

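This is what I came up with. I added some inline comments to explain what I did. For clarity, some things were moved to global scope. I generally don't recommend using global variables.

The key is understanding the recursion, and remembering that when you assign an object (not a literal), only the reference is assigned, not a copy of it.

Note that this solution assumes the graph is undirected. See the Notes section below for more details.

Feel free to ask for clarification.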
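To make the point about references concrete, here is a small sketch. The names `component`, `alias`, and `copy_` are illustrative only and do not come from the answer's code. Assigning a set to a second name binds the same object, so a later `add` is visible through both names, while `set(...)` creates an independent copy.

```python
# Assignment binds a second name to the same set object; no copy is made.
component = {'d'}
alias = component          # same object as `component`
copy_ = set(component)     # independent copy

component.add('c')         # mutate through the original name

print(alias is component)  # True: one shared object
print(sorted(alias))       # ['c', 'd']: the alias sees the change
print(sorted(copy_))       # ['d']: the copy does not
```

This is the behavior that lets several keys of a dict point at one shared set.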

from collections import defaultdict

graph = {
    'a': ['b'],
    'b': ['c'],
    'c': ['d'],
    'd': [],
    'e': ['f'],
    'f': []
}

connected_components = defaultdict(set)


def dfs(node):
    """
    The key is understanding the recursion.
    The recursive assumption is:
        After calling `dfs(node)`, the `connected_components` dict contains all the connected nodes as keys,
        and the values are *the same set* that contains all the connected nodes.
    """
    global connected_components, graph
    if node not in connected_components:
        # this is important, so neighbors won't try to traverse current node
        connected_components[node] = set()
        for next_ in graph[node]:
            dfs(next_)
            # according to the recursive assumption, the connected component of `next_` is also the one of `node`
            connected_components[node] = connected_components[next_]

        # all that's left is add the current node
        connected_components[node].add(node)

for node_ in graph:
    dfs(node_)


# get all connected components and convert to tuples, so they are hashable
connected_comp_as_tuples = map(tuple, connected_components.values())

# use ``set`` to make the list of connected components distinct (without repetition)
unique_components = set(connected_comp_as_tuples)
print(unique_components)

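Notes

This is of course not thoroughly tested... you should try it with different graphs (with cycles, single-node components, etc.).
The code could probably be improved (in both performance and clarity). For example, we create a set for every node even when we don't really need one (when the node has neighbors, the set is redundant and will be overwritten).
In the OP's original code, he uses mutable default arguments. This is a big no-no (unless you really, really know what you are doing) and, as mentioned, may cause problems like these. But not this time...
Regarding @kenny-ostrom's comment on the question, a word about definitions (unrelated to Python): connected components are only relevant for undirected graphs. For directed graphs the term is strongly connected components. The concept is the same: for every two nodes in such a component (directed or undirected), there is a path between them. So even if node 'b' is reachable from 'a', if 'a' is not reachable from 'b' (which can only happen in a directed graph), 'a' and 'b' do not share a connected component. My solution is not valid for directed graphs. It assumes the graph can be treated as undirected (in other words, if 'b' is a neighbor of 'a', we assume 'a' is a neighbor of 'b').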
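As a rough sketch of how the last two notes could be combined (an alternative for illustration, not the code from the answer): the hypothetical helper `connected_components_undirected` below creates its containers inside the function instead of using mutable default arguments, and treats the graph as undirected by first adding the reverse of every edge. The function name and the choice of an iterative, stack-based DFS are assumptions of this sketch.

```python
from collections import defaultdict


def connected_components_undirected(graph):
    """Return a list of components (sets of nodes), treating every edge as undirected."""
    # Build a symmetric adjacency map: if b is a neighbor of a, a becomes a neighbor of b.
    undirected = defaultdict(set)
    for node, neighbors in graph.items():
        undirected[node].update(neighbors)  # also registers nodes without neighbors
        for neighbor in neighbors:
            undirected[neighbor].add(node)

    visited = set()   # created on every call, so nothing leaks between calls
    components = []
    for start in undirected:
        if start in visited:
            continue
        component = set()
        stack = [start]
        while stack:
            v = stack.pop()
            if v in visited:
                continue
            visited.add(v)
            component.add(v)
            # push only the neighbors that have not been visited yet
            stack.extend(undirected[v] - visited)
        components.append(component)
    return components


graph = {'a': ['b'], 'b': ['c'], 'c': ['d'], 'd': [], 'e': ['f'], 'f': []}
for component in connected_components_undirected(graph):
    print(sorted(component))
# ['a', 'b', 'c', 'd']
# ['e', 'f']
```

The printed order assumes Python 3.7 or later, where dicts preserve insertion order. Because each component is collected in its own set during a single traversal, nodes with several neighbors and graphs with cycles are handled without any special casing, which makes this a convenient reference when testing other implementations.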
