Hitting the recursion limit and a segmentation fault

Time: 2016-09-08 18:02:23

Tags: algorithm python-2.7 recursion graph directed-acyclic-graphs

I know questions like this have been asked many times, but none of them helped me.

In the problem below, I am trying to compute the strongly connected components of a huge directed graph. Here is my code.

import os 
import sys
os.system('cls')
sys.setrecursionlimit(22764)

from itertools import groupby
from collections import defaultdict

## Reading the data in adjacency list form
data = open("data.txt", 'r')
G = defaultdict( list )

for line in data:
    lst = [int(s) for s in line.split()]
    G[lst[0]].append( lst[1] ) 


print 'Graph has been read!'



def rev_graph(  ):
    revG = defaultdict( list )
    data = open( "data.txt", 'r' )    


    for line in data:
        lst = [ int(s) for s in line.split() ]
        revG[ lst[1] ].append( lst[0] ) 

    print 'Graph has been reversed!'
    return revG


class Track(object):
    """Keeps track of the current time, current source, component leader,
    finish time of each node and the explored nodes."""

    def __init__(self):
        self.current_time = 0
        self.current_source = None
        self.leader = {}
        self.finish_time = {}
        self.explored = set()


def dfs(graph_dict, node, track):
    """Inner loop explores all nodes in an SCC. Graph represented as a dict,
    {tail node: [head nodes]}. Depth-first search runs recursively and keeps
    track of the parameters."""

    # print 'In Recursion node is ' + str(node)
    track.explored.add(node)
    track.leader[node] = track.current_source
    for head in graph_dict[node]:
        if head not in track.explored:
            dfs(graph_dict, head, track)

    track.current_time += 1
    track.finish_time[node] = track.current_time


def dfs_loop(graph_dict, nodes, track):
    """Outer loop checks out all SCCs. Current source node changes when one
    SCC inner loop finishes."""

    for node in nodes:
        if node not in track.explored:
            track.current_source = node
            dfs(graph_dict, node, track)


def scc(graph, nodes):
    """First runs dfs_loop on the reversed graph with nodes in decreasing order,
    then runs dfs_loop on the original graph with nodes in decreasing finish
    time order (obtained from the first run). Returns a dict of {leader: SCC}."""

    out = defaultdict(list)
    track = Track()

    reverse_graph = rev_graph(  )


    global G
    G = None

    dfs_loop(reverse_graph, nodes, track) ## changes here

    sorted_nodes = sorted(track.finish_time,
                          key=track.finish_time.get, reverse=True)

    # print sorted_nodes
    track.current_time = 0
    track.current_source = None
    track.explored = set()

    reverse_graph = None


    dfs_loop(graph, sorted_nodes, track)
    for lead, vertex in groupby(sorted(track.leader, key=track.leader.get),
                                key=track.leader.get):
        out[lead] = list(vertex)
    return out


maxNode = max( G.keys() )   
revNodes = list( reversed( range( 1, ( maxNode + 1 ) ) ) )

ans = scc( G, revNodes )
print 'naman'
print ans

With this recursion limit I get a Segmentation fault (core dumped) error. With any lower limit I get a "maximum recursion depth exceeded in cmp" error.

I have also attached the data file. Here is the link

1 Answer:

Answer 0: (score: 1)

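Rakete1111 gave the basic principle: don't use recursion. You can easily maintain a global list of the nodes that are explored and the nodes that are waiting; in fact, you have already gone to a lot of overhead to pass those through your methods.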
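
As a rough illustration of the "explored" and "waiting" bookkeeping, here is a minimal sketch of a traversal driven by an explicit list instead of the call stack. The function name `reachable` and the plain-dict graph are assumptions made for this example; they are not part of the code in the question.

```python
def reachable(graph, start):
    """Return the set of nodes reachable from start, without recursion.

    graph is assumed to be a dict of {tail node: [head nodes]}.
    """
    explored = set([start])   # nodes already seen
    waiting = [start]         # nodes seen but not yet expanded
    while waiting:
        node = waiting.pop()
        # .get() avoids creating empty entries in a defaultdict
        for head in graph.get(node, ()):
            if head not in explored:
                explored.add(head)
                waiting.append(head)
    return explored
```

Because `waiting` is an ordinary list on the heap, its length is bounded by available memory rather than by the interpreter's recursion limit, so a chain of hundreds of thousands of nodes is handled the same way as a short one.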

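If you want a quick way to get this going, start by making track global. Right now you pass the same instance through every traversal call. Python passes it by reference, so it is not copied, but every recursive call still allocates its own stack frame, and on a graph this size those frames pile up thousands deep. Raising the limit with sys.setrecursionlimit only lifts the interpreter's guard; the underlying C stack stays the same size, which is why a high enough limit ends in a segmentation fault.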

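Each call also carries a real memory cost, since its frame holds the state you hand down to the next call level. If you replace the recursion with a loop along the lines of "while the list is not empty", you save a lot of memory and all of those recursive calls. Can you unwind it on your own? Post a comment if you need help with the code.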
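
For reference, one way the unwinding could look is sketched below. It keeps the two-pass approach from the question (finish times on the reversed graph, then a second pass on the original graph in decreasing finish order) but replaces both recursive searches with explicit stacks. The names `finish_order` and `scc_iterative`, and the choice to take a list of `(tail, head)` edge pairs rather than read `data.txt`, are assumptions made for this sketch.

```python
from collections import defaultdict


def finish_order(graph, nodes):
    """Return nodes in increasing finish-time order using an explicit stack."""
    explored = set()
    order = []
    for root in nodes:
        if root in explored:
            continue
        explored.add(root)
        # Each entry pairs a node with an iterator over its remaining edges,
        # so the search resumes where it left off after a child finishes.
        stack = [(root, iter(graph.get(root, ())))]
        while stack:
            node, heads = stack[-1]
            for head in heads:
                if head not in explored:
                    explored.add(head)
                    stack.append((head, iter(graph.get(head, ()))))
                    break
            else:
                # No unexplored heads left: the node is finished.
                stack.pop()
                order.append(node)
    return order


def scc_iterative(edges):
    """Return {leader: [members]} for the graph given as (tail, head) pairs."""
    graph = defaultdict(list)
    reverse = defaultdict(list)
    nodes = set()
    for tail, head in edges:
        graph[tail].append(head)
        reverse[head].append(tail)
        nodes.update((tail, head))

    # Pass 1: finish times on the reversed graph.
    order = finish_order(reverse, sorted(nodes, reverse=True))

    # Pass 2: original graph, nodes in decreasing finish time.
    explored = set()
    components = {}
    for leader in reversed(order):
        if leader in explored:
            continue
        explored.add(leader)
        members = []
        stack = [leader]
        while stack:
            node = stack.pop()
            members.append(node)
            for head in graph.get(node, ()):
                if head not in explored:
                    explored.add(head)
                    stack.append(head)
        components[leader] = members
    return components
```

With this shape there is no need to call `sys.setrecursionlimit` at all, since the depth of the search only affects the length of the `stack` lists.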