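Recursion with versus without memoization

Time: 2015-11-27 14:43:38

Tags: python recursion memoization catalan

I was doing a school assignment that computes Catalan numbers with recursion. The first version, without memoization: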

def catalan_rec(n):
    res = 0
    if n == 0:
        return 1
    else:
        for i in range(n):
            res += catalan_rec(i) * catalan_rec(n-1-i)
        return res

The second version, with memoization:

def catalan_mem(n, memo = None):
    if memo == None:
        memo = {0: 1}
    res = 0
    if n not in memo:
        for i in range (n):
            res += (catalan_mem(i))*(catalan_mem(n-1-i))
        memo[n] = res
    return memo[n]

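The strangest thing happened: the memoized version took twice as long, when it should be the other way around!

Can someone explain this to me?

1 answer:

Answer 0 (score: 2)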

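This question inspired me to investigate the relative speeds of various Catalan number algorithms and various memoization schemes. The code below contains functions for the recursive algorithm given in the question, plus a simpler algorithm that needs only a single recursive call and is also easy to implement iteratively. There is also an iterative version based on the binomial coefficient. All of these algorithms are given in the Wikipedia article on Catalan numbers.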
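For reference, the three formulas these functions rely on can be written as follows. This is a summary of the standard identities, indexed to match the loops in the code below, not a derivation.

```latex
\begin{align}
C_0 &= 1, \qquad C_n = \sum_{i=0}^{n-1} C_i \, C_{n-1-i} \quad (n \ge 1) \\
C_n &= \frac{2(2n-1)}{n+1} \, C_{n-1} \quad (n \ge 1) \\
C_n &= \frac{1}{n+1} \binom{2n}{n}
\end{align}
```

The first line is the double recursion from the question, the second is the product formula that needs only one recursive call, and the third is the binomial form.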

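Getting accurate timings for most of the memoized versions is not easy. Normally, when using the timeit module, you run multiple loops of the function under test, but because of caching that does not give true results here. Getting true results requires clearing the caches. That is possible, but it is a bit messy and slow, and the clearing has to happen outside the timed section so that its overhead is not added to the time of the actual Catalan number calculation. So this code generates timing information by simply calculating one large Catalan number, with no looping.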
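If you do want repeated cold-cache timings, one possible approach is sketched below. It is not what the code in this answer does. It relies on the fact that timeit runs the setup once per timeit() call and does not include it in the measured time, so Timer.repeat() with number=1 resets the cache before every measurement. The names catalan_cached, reset_cache and cold_timings are made up for this illustration.

```python
from timeit import Timer

# Hypothetical module-level cache, mirroring the "external cache" variant.
cache = {0: 1}

def catalan_cached(n):
    """Double recursion, memoized in the module-level cache."""
    if n in cache:
        return cache[n]
    res = 0
    for i in range(n):
        res += catalan_cached(i) * catalan_cached(n - 1 - i)
    cache[n] = res
    return res

def reset_cache():
    """Return the cache to its initial state (only the base case)."""
    cache.clear()
    cache[0] = 1

def cold_timings(n, repeats=5):
    """Time catalan_cached(n) several times, each starting from a cold cache.

    The setup callable runs once per repeat and is excluded from the timing.
    """
    timer = Timer(stmt=lambda: catalan_cached(n), setup=reset_cache)
    return timer.repeat(repeat=repeats, number=1)

if __name__ == "__main__":
    # The minimum is usually the least noisy figure to compare.
    print(min(cold_timings(200)))
```

Taking the minimum of the repeats also helps with run-to-run variance, at the cost of a slightly more involved test harness.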

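As well as the timing code there is a function, verify(), which checks that all the Catalan number functions produce the same results, and a function, show_bytecode(), that prints the bytecode of each Catalan number function. The calls to both functions have been commented out. Note that verify() populates the caches, so calling verify() before time_test() will make the timing information invalid.

The code below was written and tested using Python 2.6.6, but it also runs correctly on Python 3.6.0.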

#!/usr/bin/env python

''' Catalan numbers

    Test speeds of various algorithms

    1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796, 58786, 208012, 742900, ...

    See https://en.wikipedia.org/wiki/Catalan_number
    and http://stackoverflow.com/q/33959795/4014959

    Written by PM 2Ring 2015.11.28
'''

from __future__ import print_function, division

from timeit import Timer
import dis


#Use xrange if running on Python 2
try:
    range = xrange
except NameError:
    pass

def catalan_rec_plain(n):
    ''' no memoization. REALLY slow! Eg, 26 seconds for n=16 '''
    if n < 2:
        return 1

    res = 0
    for i in range(n):
        res += catalan_rec_plain(i) * catalan_rec_plain(n-1-i)
    return res


#Most recursive versions have recursion limit: n=998, except where noted
cache = {0: 1}
def catalan_rec_extern(n):
    ''' memoize with an external cache '''
    if n in cache:
        return cache[n]

    res = 0
    for i in range(n):
        res += catalan_rec_extern(i) * catalan_rec_extern(n-1-i)
    cache[n] = res
    return res


def catalan_rec_defarg(n, memo={0: 1}):
    ''' memoize with a default keyword arg cache '''
    if n in memo:
        return memo[n]

    res = 0
    for i in range(n):
        res += catalan_rec_defarg(i) * catalan_rec_defarg(n-1-i)
    memo[n] = res
    return res


def catalan_rec_funcattr(n):
    ''' memoize with a function attribute cache '''
    memo = catalan_rec_funcattr.memo
    if n in memo:
        return memo[n]

    res = 0
    for i in range(n):
        res += catalan_rec_funcattr(i) * catalan_rec_funcattr(n-1-i)
    memo[n] = res
    return res

catalan_rec_funcattr.memo = {0: 1}


def make_catalan():
    memo = {0: 1}
    def catalan0(n):
        ''' memoize with a simple closure to hold the cache '''
        if n in memo:
            return memo[n]

        res = 0
        for i in range(n):
            res += catalan0(i) * catalan0(n-1-i)
        memo[n] = res
        return res
    return catalan0

catalan_rec_closure = make_catalan()
catalan_rec_closure.__name__ = 'catalan_rec_closure'


#Simple memoization, with initialised cache
def initialise(memo={}):    
    def memoize(f):
        def memf(x):
            if x in memo:
                return memo[x]
            else:
                res = memo[x] = f(x)
                return res
        memf.__name__ = f.__name__
        memf.__doc__ = f.__doc__
        return memf
    return memoize

#maximum recursion depth exceeded at n=499
@initialise({0: 1})
def catalan_rec_decorator(n):
    ''' memoize with a decorator closure to hold the cache '''
    res = 0
    for i in range(n):
        res += catalan_rec_decorator(i) * catalan_rec_decorator(n-1-i)
    return res

# ---------------------------------------------------------------------

#Product formula
# C_n+1 = C_n * 2 * (2*n + 1) / (n + 2)
# C_n = C_n-1 * 2 * (2*n - 1) / (n + 1)

#maximum recursion depth exceeded at n=999
def catalan_rec_prod(n):
    ''' recursive, using product formula '''
    if n < 2:
        return 1
    return (4*n - 2) * catalan_rec_prod(n-1) // (n + 1)

#Note that memoizing here gives no benefit when calculating a single value
def catalan_rec_prod_memo(n, memo={0: 1}):
    ''' recursive, using product formula, with a default keyword arg cache '''
    if n in memo:
        return memo[n]
    memo[n] = (4*n - 2) * catalan_rec_prod_memo(n-1) // (n + 1)
    return memo[n]


def catalan_iter_prod0(n):
    ''' iterative, using product formula '''
    p = 1
    for i in range(3, n + 2):
        p *= 4*i - 6 
        p //= i 
    return p


def catalan_iter_prod1(n):
    ''' iterative, using product formula, with incremented m '''
    p = 1
    m = 6
    for i in range(3, n + 2):
        p *= m
        m += 4 
        p //= i 
    return p

#Add memoization to catalan_iter_prod1
@initialise({0: 1})
def catalan_iter_memo(n):
    ''' iterative, using product formula, with incremented m and memoization '''
    p = 1
    m = 6
    for i in range(3, n + 2):
        p *= m
        m += 4 
        p //= i 
    return p

def catalan_iter_prod2(n):
    ''' iterative, using product formula, with zip '''
    p = 1
    for i, m in zip(range(3, n + 2), range(6, 4*n + 2, 4)):
        p *= m
        p //= i 
    return p


def catalan_iter_binom(n):
    ''' iterative, using binomial coefficient '''
    m = 2 * n
    n += 1
    p = 1
    for i in range(1, n):
        p *= m
        p //= i
        m -= 1
    return p // n


#All the functions, in approximate speed order
funcs = (
    catalan_iter_prod1,
    catalan_iter_memo,
    catalan_iter_prod0,
    catalan_iter_binom,
    catalan_iter_prod2,

    catalan_rec_prod,
    catalan_rec_prod_memo,
    catalan_rec_defarg,
    catalan_rec_closure,
    catalan_rec_extern,
    catalan_rec_decorator,
    catalan_rec_funcattr,
    #catalan_rec_plain,
)

# ---------------------------------------------------------------------

def show_bytecode():
    for func in funcs:
        fname = func.__name__
        print('\n%s' % fname)
        dis.dis(func)

#Check that all functions give the same results
def verify(n):
    range_n = range(n)
    #range_n = [n]
    func = funcs[0]
    table = [func(i) for i in range_n]
    #print(table)
    for func in funcs[1:]:
        print(func.__name__, [func(i) for i in range_n] == table)

def time_test(n):
    ''' Print timing stats for all the functions '''
    res = []
    for func in funcs:
        fname = func.__name__
        print('\n%s: %s' % (fname, func.__doc__))
        setup = 'from __main__ import cache, ' + fname
        cmd = '%s(%d)' % (fname, n)
        t = Timer(cmd, setup)
        r = t.timeit(1)
        print(r)
        res.append((r, fname))

    ##Sort results from fast to slow
    #print()
    #res.sort()
    #for t, fname in res:
        #print('%s:\t%s' % (fname, t))
        ##print('%s,' % fname)


#show_bytecode()

#verify(50)
#verify(997)

time_test(450)

#for i in range(20):
    #print('%2d: %d' % (i, catalan_iter_binom(i)))

Typical results

catalan_iter_prod1:  iterative, using product formula, with incremented m 
0.00119090080261

catalan_iter_memo:  iterative, using product formula, with incremented m and memoization 
0.001140832901

catalan_iter_prod0:  iterative, using product formula 
0.00202202796936

catalan_iter_binom:  iterative, using binomial coefficient 
0.00141906738281

catalan_iter_prod2:  iterative, using product formula, with zip 
0.00123286247253

catalan_rec_prod:  recursive, using product formula 
0.00263595581055

catalan_rec_prod_memo:  recursive, using product formula, with a default keyword arg cache 
0.00210690498352

catalan_rec_defarg:  memoize with a default keyword arg cache 
0.46977186203

catalan_rec_closure:  memoize with a simple closure to hold the cache 
0.474807024002

catalan_rec_extern:  memoize with an external cache 
0.47812795639

catalan_rec_decorator:  memoize with a decorator closure to hold the cache 
0.47876906395

catalan_rec_funcattr:  memoize with a function attribute cache 
0.516775131226

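The results above were produced on a 2GHz Pentium 4 with minimal system load. However, there is quite a bit of variance from run to run, especially with the faster algorithms.

As you can see, for the doubly recursive algorithm used in the question, using a default argument for the cache is actually a pretty good approach: it was the fastest of the memoized double-recursion variants in this run, although the differences between them are small. So a cleaned-up version of your recursive function is: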

def catalan_rec(n, memo={0: 1}):
    ''' recursive Catalan numbers, with memoization '''
    if n in memo:
        return memo[n]

    res = 0
    for i in range(n):
        # Recurse into this function itself so the shared default cache is used
        res += catalan_rec(i) * catalan_rec(n-1-i)
    memo[n] = res
    return res
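As for why the catalan_mem function in the question is slower than the plain version: its recursive calls do not pass memo along, so every call sees memo as None and builds a fresh {0: 1} dictionary. Nothing is ever reused, and the dictionary creation and lookups are pure overhead. If you want to keep the memo=None signature, a minimal fix is to hand the same dictionary to the recursive calls, as sketched here.

```python
def catalan_mem(n, memo=None):
    """Memoized double recursion; the cache lives for one top-level call."""
    if memo is None:
        memo = {0: 1}
    if n not in memo:
        res = 0
        for i in range(n):
            # Pass the same dict down so results are actually shared.
            res += catalan_mem(i, memo) * catalan_mem(n - 1 - i, memo)
        memo[n] = res
    return memo[n]
```

Unlike the default-argument version, this cache is rebuilt on every top-level call, so repeated calls do not benefit from earlier ones.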

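However, it is much more efficient to use one of the iterative algorithms, e.g. catalan_iter_prod1. If you intend to call the function many times with a high probability of repeated arguments, then use the memoized version, catalan_iter_memo.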
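On Python 3.2 or later, the standard library's functools.lru_cache can stand in for the hand-written decorator. The sketch below assumes Python 3 only (the code above also targets Python 2.6, where lru_cache is not available) and wraps the same loop as catalan_iter_prod1. The name catalan is just for illustration.

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache keyed on n
def catalan(n):
    """Iterative product formula, memoized across calls."""
    p = 1
    m = 6
    for i in range(3, n + 2):
        p *= m      # multiply by 4*i - 6
        m += 4
        p //= i     # the division is exact at every step
    return p

if __name__ == "__main__":
    print(catalan(10))           # computed by the loop
    print(catalan(10))           # served from the cache
    print(catalan.cache_info())  # hit and miss counters
```

An unbounded cache keeps every result alive, which may matter if you ask for many very large Catalan numbers.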

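In conclusion, I should mention that it is best to avoid recursion unless it is appropriate to the problem domain (e.g. when working with recursive data structures like trees). Python cannot perform tail call elimination, and it imposes a recursion limit. So if there is an iterative algorithm, it is almost always a better choice than a recursive one. Of course, if you are learning about recursion and your teacher wants you to write recursive code, then you do not have much choice. :)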
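To make the recursion limit concrete, here is a small sketch, assuming Python 3.5 or later where the exception is called RecursionError. It compares the recursive product formula with an equivalent loop at a depth just past the interpreter's current limit. The helper hits_recursion_limit and the name catalan_iter_prod are made up for this example.

```python
import sys

def catalan_rec_prod(n):
    """Recursive product formula: one stack frame per step."""
    if n < 2:
        return 1
    return (4 * n - 2) * catalan_rec_prod(n - 1) // (n + 1)

def catalan_iter_prod(n):
    """Same formula as a loop: constant stack depth."""
    p = 1
    for i in range(2, n + 1):
        p = p * (4 * i - 2) // (i + 1)
    return p

def hits_recursion_limit(func, n):
    """Return True if calling func(n) raises RecursionError."""
    try:
        func(n)
    except RecursionError:
        return True
    return False

# A depth slightly beyond whatever limit this interpreter is using.
deep_n = sys.getrecursionlimit() + 100

if __name__ == "__main__":
    print(hits_recursion_limit(catalan_rec_prod, deep_n))
    print(hits_recursion_limit(catalan_iter_prod, deep_n))
```

Raising the limit with sys.setrecursionlimit() is possible, but it only moves the ceiling; the loop has none.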