Bug in my Ruby version of Tarjan's algorithm

Time: 2014-04-18 05:06:18

Tags: ruby tarjans-algorithm

http://en.wikipedia.org/wiki/Tarjan's_strongly_connected_components_algorithm

http://en.algoritmy.net/article/44220/Tarjans-algorithm

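I can't find the bug in my Ruby version of Tarjan's algorithm for strongly connected components. I got the Kosaraju-Sharir algorithm working, and my Ruby version of Tarjan's algorithm works for some graphs. But it fails to join two components that should be one: "10" and "11, 12, 9".

The input file is this directed graph: http://algs4.cs.princeton.edu/42directed/tinyDG.txt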

expected: [["1"], ["0", "2", "3", "4", "5"], ["10", "11", "12", "9"], ["6", "8"], ["7"]]
got: [["1"], ["0", "2", "3", "4", "5"], ["10"], ["11", "12", "9"], ["6", "8"], ["7"]]

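In the final loop, which tries to build a single component, the first item popped is "10" (the last item on the stack), but the current vertex ("parent") is also "10"! That makes the loop cut "10" off as a separate component. Why is the most recent item on the stack the same as the parent? I would expect "10" to appear only at the END of the component, after we have collected ["12", "11", "9", ... and then "10"]. Because "10" comes first instead of last, we have this problem. How do I fix it?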

  begin
    last_stack_item = stack.pop
    component << last_stack_item.name
  end while last_stack_item != parent # we're back at the root

My Ruby code:

    # Tarjan's algorithm to find all strongly connected components (SCCs)
    def scc_tarjan
      index = 0 # numbers nodes consecutively in the order discovered
      stack, scc, vertices = [], [], []

      # create new Struct, if not already defined
      if Struct::const_defined?("TarjanVertex")
        Struct.const_get("TarjanVertex")
      else
        Struct.new("TarjanVertex", :name, :index, :lowlink)
      end

      adj_lists.each do |v|
        # -1 means vertex is unvisited
        vertex = Struct::TarjanVertex.new(v.name, -1, -1)
        vertices << vertex  # array of all TarjanVertex objects in graph
      end
      vertices.each do |vertex|
        tarjan_dfs(vertex, scc, stack, index, vertices) if vertex.index == -1
      end
      # return nested array of all SCCs in graph
      scc
    end

  def tarjan_dfs(parent, scc, stack, index, vertices)
    # Set depth index for vertex to smallest unused index
    parent.index = index
    # lowlink is roughly the smallest index of any node known to be reachable from the vertex
    parent.lowlink = index
    index += 1
    stack << parent
    # loop through all vertices connected to parent
    adj_vertices(parent.name, adj_lists).each do |adj_vertex|
      # since adj_vertices returns array of strings,
      # must convert to TarjanVertex objects
      child = vertices.select {|v| v.name == adj_vertex}.first

      if child.index == -1  # if child vertex not yet visited
        tarjan_dfs(child, scc, stack, index, vertices) # recurse on child

        # change parent's lowlink to the smaller lowlink of parent and child
        parent.lowlink = [parent.lowlink, child.lowlink].min

      # vertex points to earlier (already visited) one in stack,
      # with lower index. thus it's the current SCC
      elsif stack.include?(child)
        parent.lowlink = [parent.lowlink, child.index].min
      end
    end

    # if a vertex's lowlink = its index here, this # cannot go any lower.
    # vertex MUST be root of the SCC.
    if parent.lowlink == parent.index
      component = []  # a single SCC

      # pop off entire SCC, one vertex at a time
      begin
        last_stack_item = stack.pop
        component << last_stack_item.name
      end while last_stack_item != parent # we're back at the root
      scc << component.sort # done with a single SCC
    end
  end

2 answers:

Answer 0 (score: 1)

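I solved my own problem! After stepping through every loop of my code with pen and paper, I found that it entered the component loop at the bottom prematurely, at vertex 4. At that point, parent.lowlink should NOT equal parent.index. I only had to change one word to fix my problem!

I changed "child.index" to "child.lowlink" in the "elsif stack.include?(child)" branch! This correctly lowered vertex 4's lowlink to match vertex 6's lowlink.

After that change, parent.lowlink != parent.index, so it no longer starts building a new component prematurely.

Interestingly, my solution differs from all the pseudocode and online code I have found for Tarjan's algorithm, which all say "parent.lowlink = [parent.lowlink, child.index].min"

Instead, I needed "parent.lowlink = [parent.lowlink, child.lowlink].min"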

Answer 1 (score: -1)

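index is the timestamp of the depth-first search, meaning its value should increase by 1 each time dfs() reaches an unvisited vertex. Every node should therefore get a different index value, and when the algorithm finishes, index should equal the number of vertices in the graph.

But you pass index as a parameter to tarjan_dfs. Because it is passed by value, index += 1 inside dfs() changes only a copy of index. As a result, index ends up being the depth in the DFS tree (the tree formed by the depth-first traversal). That is the source of the bug.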
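
A minimal sketch of that behavior, using hypothetical helper names (`bump`, `bump_wrapped`) that are not part of the question's code: reassigning an Integer parameter rebinds only the callee's local variable, while mutating a Hash that both sides reference is visible to the caller.

```ruby
# Hypothetical helpers for illustration; not from the question's code.

# `index += 1` rebinds the local variable only; the caller's Integer is untouched.
def bump(index)
  index += 1
  index
end

# Mutating the Hash that caller and callee both reference is visible to the caller.
def bump_wrapped(counter)
  counter[:value] += 1
end

index = 0
bump(index)
puts index            # the caller still sees 0

counter = { value: 0 }
bump_wrapped(counter)
puts counter[:value]  # the caller sees 1
```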

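So using a global variable $index instead of the local variable index fixes the bug. In fact, all of the code linked at the top of the question uses index as a global variable.

If you don't want a global variable but still want the same effect, you can wrap the counter in a mutable object. For example: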

  • change index = 0 to index = {value: 0}
  • change parent.index = index to parent.index = index[:value]
  • change parent.lowlink = index to parent.lowlink = index[:value]
  • change index += 1 to index[:value] += 1
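
As a rough sketch, these four changes could be applied to a pared-down version of the question's code as follows. The method and Struct names mirror the question, but the `graph` Hash (vertex name to successor names) and the `vertices` lookup Hash are assumptions made to keep the example self-contained; they stand in for the question's adj_lists and adj_vertices, which are not shown in full. The sample graph is the six-edge test case traced at the end of this answer.

```ruby
# Sketch only: the plain-Hash graph and the `vertices` lookup are assumptions,
# not the question's adj_lists / adj_vertices structures.
TarjanVertex = Struct.new(:name, :index, :lowlink)

def tarjan_dfs(parent, graph, vertices, stack, scc, index)
  parent.index = index[:value]    # was: parent.index = index
  parent.lowlink = index[:value]  # was: parent.lowlink = index
  index[:value] += 1              # was: index += 1
  stack << parent

  graph[parent.name].each do |name|
    child = vertices[name]
    if child.index == -1          # child not yet visited
      tarjan_dfs(child, graph, vertices, stack, scc, index)
      parent.lowlink = [parent.lowlink, child.lowlink].min
    elsif stack.include?(child)   # child is in the component being built
      parent.lowlink = [parent.lowlink, child.index].min
    end
  end

  return unless parent.lowlink == parent.index

  # parent is the root of an SCC: pop until parent itself comes off the stack
  component = []
  loop do
    popped = stack.pop
    component << popped.name
    break if popped.equal?(parent)
  end
  scc << component.sort
end

def scc_tarjan(graph)
  index = { value: 0 }            # was: index = 0
  vertices = graph.keys.map { |name| [name, TarjanVertex.new(name, -1, -1)] }.to_h
  stack = []
  scc = []
  vertices.each_value do |vertex|
    tarjan_dfs(vertex, graph, vertices, stack, scc, index) if vertex.index == -1
  end
  scc
end

graph = { "0" => ["1", "3"], "1" => ["2", "0"], "2" => ["1"], "3" => ["2"] }
p scc_tarjan(graph)
```

On this graph all four vertices lie on one cycle, so the call should print a single component containing "0" through "3".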

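Here is my runnable Ruby implementation, with a random graph generator, which compares the output of the two procedures. I just hope it is useful.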

# My version:
def tarjan_scc(adj)
  n = adj.size
  dfn = Array.new(n, -1) # dfn[u] is the timestamp when dfs reached node u
  low = Array.new(n, -1) # low[u] is the lowest index that u or u's children can reach in at most one step
  index = {value: 0}
  stk, sccs = [], []
  (0...n).each do |u|
    tarjan_scc_dfs(adj, u, index, dfn, low, stk, sccs) if dfn[u] == -1
  end
  sccs.sort!
end

def tarjan_scc_dfs(adj, u, index, dfn, low, stk, sccs)
  dfn[u] = low[u] = index[:value]
  index[:value] += 1
  stk.push(u)
  adj[u].each do |v|
    if dfn[v] == -1
      tarjan_scc_dfs(adj, v, index, dfn, low, stk, sccs)
      low[u] = [low[u], low[v]].min
    elsif stk.include?(v)
      low[u] = [low[u], dfn[v]].min
    end
  end
  if dfn[u] == low[u]
    scc = []
    scc << stk.pop while stk[-1] != u
    sccs << scc.push(stk.pop).sort
  end
end


# Test version, with these two changes:
# 1) change Hash `index` to Integer `index`
# 2) change `low[u] = [low[u], dfn[v]].min` to `low[u] = [low[u], low[v]].min`
def tarjan_scc_dfs_test(adj, u, index, dfn, low, stk, sccs)
  dfn[u] = low[u] = index
  index += 1
  stk.push(u)
  adj[u].each do |v|
    if dfn[v] == -1
      tarjan_scc_dfs_test(adj, v, index, dfn, low, stk, sccs)
      low[u] = [low[u], low[v]].min
    elsif stk.include?(v)
      low[u] = [low[u], low[v]].min
    end
  end
  if dfn[u] == low[u]
    scc = []
    scc << stk.pop while stk[-1] != u
    sccs << scc.push(stk.pop).sort
  end
end

def tarjan_scc_test(adj)
  n = adj.size
  dfn = Array.new(n, -1)
  low = Array.new(n, -1)
  index = 0
  stk, sccs = [], []
  (0...n).each do |u|
    tarjan_scc_dfs_test(adj, u, index, dfn, low, stk, sccs) if dfn[u] == -1
  end
  sccs.sort!
end


# Randomly generate a simple directed graph with n nodes, where 1 <= n <= max_n.
# Nodes are numbered 0 to n - 1. Edges are stored as adjacency lists.
def generate_graph(max_n)
  @rng ||= Random.new(Time.hash)
  n = @rng.rand(1..max_n)
  ed = []
  n.times do |i|
    n.times do |j|
      ed << [i, j] if i != j
    end
  end

  ed.size.times do |i|
    j = @rng.rand(i...ed.size)
    ed[i], ed[j] = ed[j], ed[i]
  end

  adj = Array.new(n) { Array.new }
  @rng.rand(0..ed.size).times do |i|
    u, v = ed[i]
    adj[u] << v
  end
  adj
end

# Main loop: generate random graphs and run both functions until their answers differ.
while true
  adj = generate_graph(8)
  sccs = tarjan_scc(adj)
  sccs_test = tarjan_scc_test(adj)
  if sccs != sccs_test
    puts "Graph: "
    adj.size.times do |u|
      puts "#{u}: #{adj[u]}"
    end
    puts "Correct components output:"
    p sccs
    puts "Wrong components output by test program:"
    p sccs_test
    break
  end
end

Update

Here are the steps of your modified algorithm on this test case (assuming I understand your algorithm correctly):

0->1, 1->2, 2->1, 1->0, 0->3, 3->2.
init: [[index=0, lowlink=0], [index=-1, lowlink=-1], [index=-1, lowlink=-1], [index=-1, lowlink=-1]]
0->1: [[index=0, lowlink=0], [index=1, lowlink=1], [index=-1, lowlink=-1], [index=-1, lowlink=-1]]
1->2: [[index=0, lowlink=0], [index=1, lowlink=1], [index=2, lowlink=2], [index=-1, lowlink=-1]]
2->1: [[index=0, lowlink=0], [index=1, lowlink=1], [index=2, lowlink=1], [index=-1, lowlink=-1]]
(return from node 2, now parent is node 1)
1->0: [[index=0, lowlink=0], [index=1, lowlink=0], [index=2, lowlink=1], [index=-1, lowlink=-1]]
(note: when the edges are visited in this order, node 2's lowlink will be 1, not 0)
(return from node 1, now parent is node 0)
0->3: [[index=0, lowlink=0], [index=1, lowlink=0], [index=2, lowlink=1], [index=1, lowlink=1]]
3->2: [[index=0, lowlink=0], [index=1, lowlink=0], [index=2, lowlink=1], [index=1, lowlink=1]]
(node 2's lowlink is 1, so node 3's lowlink will not be 0, and node 3 will be marked as a single SCC)
(return from node 3, now parent is node 0)
(mark [0, 1, 2] as an SCC)

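As you can see, with your modified algorithm the order in which the edges are visited changes the answer. In Tarjan's algorithm, that order does not matter.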
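
The trace above can be reproduced with a small sketch. `scc_with` and its `shared_counter:` flag are hypothetical names introduced here, not part of the question or of the implementation earlier in this answer. Both settings use the `low[v]` update from the question's fix; the only difference is whether every call draws from one shared timestamp or works on the value it was handed.

```ruby
# Sketch for illustration: `scc_with` and `shared_counter:` are hypothetical names.
def scc_with(adj, shared_counter:)
  n = adj.size
  dfn = Array.new(n, -1)
  low = Array.new(n, -1)
  stk = []
  sccs = []
  counter = 0

  dfs = lambda do |u, index|
    # shared_counter: true  -> one timestamp sequence for the whole search
    # shared_counter: false -> use the value handed in by the caller, which
    #                          mimics passing an Integer `index` by value
    stamp = shared_counter ? counter : index
    dfn[u] = low[u] = stamp
    counter += 1
    stk.push(u)
    adj[u].each do |v|
      if dfn[v] == -1
        dfs.call(v, stamp + 1)
        low[u] = [low[u], low[v]].min
      elsif stk.include?(v)
        low[u] = [low[u], low[v]].min   # the low[v] variant from the question's fix
      end
    end
    if dfn[u] == low[u]
      scc = []
      scc << stk.pop while stk[-1] != u
      sccs << scc.push(stk.pop).sort
    end
  end

  n.times { |u| dfs.call(u, 0) if dfn[u] == -1 }
  sccs.sort
end

# Edges in the traced order: 0->1, 1->2, 2->1, 1->0, 0->3, 3->2
adj = [[1, 3], [2, 0], [1], [2]]
p scc_with(adj, shared_counter: false)  # => [[0, 1, 2], [3]]
p scc_with(adj, shared_counter: true)   # => [[0, 1, 2, 3]]
```

On this particular graph, switching to a shared counter is enough to recover the single four-vertex component; the sketch makes no claim beyond this one case.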