python递归内存不足

时间:2013-11-26 22:26:21

标签: python memory recursion out-of-memory

当此代码运行时,OSX通知我我已经没有应用程序内存并暂停应用程序。 Python使用的空间量非常快地打破了10场演出。这段代码永远不会达到Python的最大递归级别,它应该只能达到525最坏的情况,但由于缓存它应该小得多。我有一种感觉,列表链正在被每个级别的递归复制,但似乎它是一个全局变量,应该与每次调用collat​​z()共享。我在stackoverflow上找了类似的问题,但我没有发现任何相同的问题。

# The following iterative sequence is defined for the set of positive integers:
#  n = n/2  (n is even)
#    = 3n+1 (n is odd)
# Using the rule above and starting with 13, we generate the following sequence:
#  13,40,20,10,5,16,8,4,2,1
# It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 
#  terms. Although it has not been proved yet (Collatz Problem), it is thought that
#  all starting numbers finish at 1.
# Which starting number, under one million, produces the longest chain?
# Comments: I'm using dynamic programming to avoid computing the same values over and over
#            again.

lim = 1000000
chain = [1,1]
maxL = 0

def collatz(i):
   if i >= len(chain): chain.extend([None]*(i-len(chain)+1))
   if chain[i] == None: chain[i] = 1 + collatz(i/2 if i%2==0 else 3*i+1)
   return chain[i]

for i in range(1, lim): 
   maxL = i if (collatz(i) > chain[maxL]) else maxL 
   print i, chain[maxL]
print "collatz chain of {} is {} terms long.".format(maxL, collatz(maxL))

编辑:在这里查看我的工作字典实现:https://stackoverflow.com/a/20229855/1084773

2 个答案:

答案 0 :(得分:3)

要查看内存错误,请使用limit = 100运行代码,然后打印出链。

也许你想序列化你的递归代码:

lengths = {1: 1}

def collatz(i):
    i0 = i
    acc = 0
    while True:
        if i in lengths:
            lengths[i0] = acc + lengths[i]
            return acc + lengths[i]
        acc += 1
        i = (i * 3 + 1) if i % 2 else i // 2

longest = 1
for i in range(1, 1000000):
    c = collatz(i)
    if c > longest:
        longest = c
        print(i, c)

这肯定可以在很多方面进行优化,但它会在4秒内产生预期的结果。


修改

您的方法会创建一个列表,其长度达到最高期限。对于limit = 100,这是9232.这不是那么多。但是对于limit = 1000000,它是56991483520(从704511开始的链),这是一个非常大的。如果它只是一个超过int32的数组,这已经是212 GB的内存,实际上它不仅仅是那个。


这里麻烦的链:704511,2113534,1056767,3170302,1585151,4755454,2377727,7133182,3566591,10699774,5349887,16049662,8024831,24074494,12037247,36111742,18055871,54167614,27083807,81251422,40625711, 121877134,60938567,182815702,91407851,274223554,137111777,411335332,205667666,102833833,308501500,154250750,77125375,231376126,115688063,347064190,173532095,520596286,260298143,780894430,390447215,1171341646,585670823,1757012470,878506235,2635518706, 1317759353,3953278060,1976639030,988319515,2964958546,1482479273,4447437820,2223718910,1111859455,3335578366,1667789183,5003367550,2501683775,7505051326,3752525663,11257576990,5628788495,16886365486,8443182743,25329548230,12664774115,37994322346,18997161173,的 56991483520 ,28495741760,14247870880,7123935440,3561967720,1780983860,890491930,445245965,1355737896,667868948,333934474,1666967237,5500901712,250450856,125225428,62612714,31306357,939 19072,46959536,23479768,11739884,5869942,2934971,8804914,4402457,13207372,6603686,3301843,9905530,4955265,14858296,7429148,3714574,1857287,5571862,2785931,8357794,4178897,12536692,6268346,3134173,9402520, 4701260,2350630,1175315,3525946,1762973,5288920,2644460,1322230,661115,1983346,991673,2975020,1487510,743755,2231266,11153333,3346900,1673450,836725,2510176,1255088,627544,313772,156886,78443, 235330,117665,352996,176498,88249,264748,132374,66187,198562,99281,297844,148922,74461,223384,111692,55846,27923,83770,418885,125656,62828,31414,15707,47122,23561, 70684,35342,17671,53014,26507,79522,39761,119284,59642,29821,89464,44732,22366,11183,33550,16775,50326,25163,75490,37745,113236,56618,28309,84928,42464, 21232,10616,5308,2654,1327,3982,1991,5974,2987,8962,4481,1344,6722,3361,10084,5042,2521,7564,3782,1891,5674,2837,8512,4256,2128, 1064,532,266,133,400,200,100,5 0,25,76,38,19,58,29,88,44,22,11,34,17,52,26,13,40,20,10,5,16,8,4,2,1 < / p>


使用你的确切递归想法,但使用dict(稀疏)而不是列表(没有probs运行):

lengths = {1: 1}

def collatz(i):
    if i in lengths: return lengths [i]
    j = (i * 3 + 1) if i % 2 else i // 2
    c = collatz (j) + 1
    lengths [i] = c
    return c

longest = 1
for i in range(1, 1000000):
    c = collatz(i)
    if c > longest:
        longest = c
        print(i, c)

答案 1 :(得分:1)

列表实现使用了大量的额外内存(正如Hyperboreus指出的那样,谢谢!)。切换到字典可以将内存使用量从大约212GB减少到大约70MB,并且可以在2秒内解决问题。

lim = 1000000
chain = {1:1} 
maxL = 1

def collatz(i):
   if i not in chain: chain[i] = 1 + collatz(i/2 if i%2==0 else 3*i+1)
   return chain[i]

for i in range(1, lim): maxL = i if (collatz(i) > chain[maxL]) else maxL 
print "collatz chain of {} is {} terms long.".format(maxL, collatz(maxL))