Optimizing a brute-force process in Python / PyPy to find one-child numbers

Time: 2013-03-13 06:32:44

Tags: python optimization multiprocessing brute-force pypy

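A brute-force approach is not meant to solve the problem, only to help me research it. I am working on a Project Euler problem that asks for all numbers from X up to, but not including, Y that have exactly one "substring" divisible by the number of digits in the number.

These are called one-child numbers. 104 is a one-child number: of its substrings [1, 0, 4, 10, 04, 104], only 0 is divisible by 3. The problem asks for the quantity of one-child numbers less than 10**17. Brute force is not the right method for that; however, I have a theory that requires me to know the quantity of one-child numbers below 10**11.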
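As a point of reference, the definition can be checked directly for a single number. The following is a minimal sketch, using the hypothetical helper name is_one_child, that counts every substring separately (leading zeros included, so 04 counts apart from 4) and stops as soon as a second divisible substring appears:

```python
def is_one_child(n):
    """Return True if exactly one substring of n's decimal digits
    is divisible by the number of digits in n."""
    digits = str(n)
    d = len(digits)
    hits = 0
    for i in range(d):
        for j in range(i + 1, d + 1):
            # int() drops leading zeros, so "04" is tested as 4
            if int(digits[i:j]) % d == 0:
                hits += 1
                if hits > 1:
                    return False  # a second child disqualifies n
    return hits == 1


print(is_one_child(104))  # the example from the problem statement
```

This is only a per-number check for sanity-testing faster code, not a way to reach 10**11.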

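Even after leaving my laptop running for half a day, I have not managed to find this number. I tried Cython, but I am a novice programmer who knows nothing about C, and the result was really bad. I even tried cloud computing, but my ssh pipe always breaks before the process completes.

If anyone can help me identify different approaches or optimizations to perform BRUTE FORCE on this problem up to 10**11, it would be greatly appreciated.

Please do not...

offer me advice on number theory or your answers to this problem, as I have been working on it for a good while and I really want to reach the conclusion on my own.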

## a one-child number has only one "substring" divisible by the
## number of digits in the number. Example: 104 is a one-child number as 0
## is the only substring which 3 may divide, of the set [1,0,4,10,04,104]

## FYI one-child numbers are positive, so the number 0 is not one-child


from multiprocessing import Pool
import os.path

def OneChild(numRange): # hopefully (1, 10**11)
    OneChild = []
    start = numRange[0]
    number = numRange[1]

    ## top loop handles one number at a time
    ## loop ends when start becomes larger than end
    while number >= start:

        ## preparing to analyze one number
        ## for exactly one divisible substring
        numberString = str(start)
        numDigits = len(numberString)
        divisableSubstrings = 0
        ticker1,ticker2 = 0, numDigits

        ## ticker1 starts at 0 and ends at number of digits - 1
        ## ticker2 starts at number of digits and ends +1 from ticker1
        ## an example for a three digit number: (0,3) (0,2) (0,1) (1,3) (1,2) (2,3)
        while ticker1 <= numDigits+1:
            while ticker2 > ticker1:
                if int(numberString[ticker1:ticker2]) % numDigits == 0:
                    divisableSubstrings += 1
                    if divisableSubstrings == 2:
                        ticker1 = numDigits+1
                        ticker2 = ticker1

                ##Counters    
                ticker2 -= 1
            ticker1 += 1
            ticker2 = numDigits             
        if divisableSubstrings == 1: ## One-Child Bouncer 
            OneChild.append(start) ## inefficient but I want the specifics
        start += 1 
    return (OneChild)

## Speed seems to improve with more pool arguments, labeled here as cores
## I'm guessing this is due to pypy performing best when the task is neither
## too large nor too small
def MultiProcList(numRange,start = 1,cores = 100): # multiprocessing
    print "Asked to use %i cores between %i numbers: From %s to %s" % (cores,numRange-start, start,numRange)
    cores = adjustCores(numRange,start,cores)
    print "Using %i cores" % (cores)

    chunk = (numRange+1-start)/cores
    end = chunk+start -1 
    total, argsList= 0, []
    for i in range(cores):
        # print start,end-1
        argsList.append((start,end-1))
        start, end = end , end + chunk
    pool = Pool(processes=cores)
    data = pool.map(OneChild,argsList)
    for d in data:
        total += len(d)
    print total

##    f = open("Result.txt", "w+")
##    f.write(str(total))
##    f.close()

def adjustCores(numRange,start,cores):
    if start == 1:
        start = 0
    else:
        pass
    while (numRange-start)%cores != 0:
        cores -= 1
    return cores

#MultiProcList(10**7)
from timeit import Timer
t = Timer(lambda: MultiProcList(10**6))
print t.timeit(number=1)
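The chunk arithmetic in MultiProcList only works when the range divides evenly by the worker count, which is why adjustCores keeps shrinking cores. One possible alternative, sketched here under the hypothetical name split_range, hands out the remainder one number at a time so that any worker count can be used. It assumes stop > start and returns inclusive (first, last) pairs, the same shape OneChild expects:

```python
def split_range(start, stop, parts):
    """Split the integers start..stop-1 into at most `parts`
    contiguous, inclusive (first, last) pairs of near-equal size."""
    total = stop - start
    parts = max(1, min(parts, total))  # avoid creating empty chunks
    base, extra = divmod(total, parts)
    chunks = []
    first = start
    for k in range(parts):
        # the first `extra` chunks each take one leftover number
        size = base + (1 if k < extra else 0)
        chunks.append((first, first + size - 1))
        first += size
    return chunks


print(split_range(1, 11, 3))
```

With this, pool.map(OneChild, split_range(1, 10**6, cores)) would cover 1 through 10**6 - 1 without adjusting cores.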

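1 Answer:

Answer 0 (score: 1)

Here is my fastest brute-force code. It uses Cython to speed up the calculation. Instead of checking every number, it finds all the one-child numbers by recursion.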

%%cython
cdef int _one_child_number(int s, int child_count, int digits_count):
    cdef int start, count, c, child_count2, s2, part, i
    if s >= 10**(digits_count-1):
        return child_count
    else:
        if s == 0:
            start = 1
        else:
            start = 0
        count = 0
        for c in range(start, 10):
            s2 = s*10 + c
            child_count2 = child_count
            i = 10
            while True:
                part = s2 % i
                if part % digits_count == 0:
                    child_count2 += 1
                    if child_count2 > 1:
                        break
                if part == s2:
                    break
                i *= 10

            if child_count2 <= 1:
                count += _one_child_number(s2, child_count2, digits_count)
        return count 

def one_child_number(int digits_count):
    return _one_child_number(0, 0, digits_count)

Computing F(10**7) takes about 100 ms and gives the result 277674.

print sum(one_child_number(i) for i in xrange(1, 8))

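You need 64-bit integers to compute larger results.

Edit: I added some comments, but my English is not good, so I converted the code to pure Python and added some prints to help you figure out how it works.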

_one_child_number builds the number recursively from the leftmost digit, appending one digit to the right of s on each call. child_count is the number of children found in s so far, and digits_count is the final number of digits of s.

def _one_child_number(s, child_count, digits_count):
    print s, child_count
    if s >= 10**(digits_count-1): # if the length of s is digits_count
        return child_count # child_count is 0 or 1 here, 1 means we found one one-child-number.
    else:
        if s == 0:
            start = 1 # if the length of s is 0, we choose from 123456789 for the leftmost digit.
        else:
            start = 0 # otherwise we choose from 0123456789
        count = 0 # init the one-child-number count
        for c in range(start, 10): # loop for every digit
            s2 = s*10 + c # add digit c to the right of s
            # following code calculates the child count of s2
            child_count2 = child_count
            i = 10
            while True:
                part = s2 % i
                if part % digits_count == 0:
                    child_count2 += 1
                    if child_count2 > 1: # when child count > 1, it's not a one-child-number, break
                        break
                if part == s2:
                    break
                i *= 10

            # if the child count so far is less than or equal to 1,
            # call _one_child_number recursively to add the next digit.
            if child_count2 <= 1:
                count += _one_child_number(s2, child_count2, digits_count)
        return count

In the output of _one_child_number(0, 0, 3), the count of 3-digit one-child numbers is the sum of the second column over the rows whose first column is a 3-digit number.
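To make the "sum of the second column" remark concrete, here is a small sketch that follows the same recursion as _one_child_number but collects each (s, child_count) pair in a list instead of printing it. The name trace_one_child and the rows argument are hypothetical additions for illustration. The code avoids print statements inside the recursion so it runs unchanged on Python 2 and Python 3:

```python
def trace_one_child(s, child_count, digits_count, rows):
    """Same recursion as _one_child_number, recording what it would print."""
    rows.append((s, child_count))
    if s >= 10 ** (digits_count - 1):
        return child_count  # s has reached its full length
    start = 1 if s == 0 else 0  # no leading zero for the first digit
    count = 0
    for c in range(start, 10):
        s2 = s * 10 + c
        child_count2 = child_count
        i = 10
        while True:
            part = s2 % i  # suffixes of s2: the substrings ending at c
            if part % digits_count == 0:
                child_count2 += 1
                if child_count2 > 1:
                    break  # prune: s2 already has two children
            if part == s2:
                break
            i *= 10
        if child_count2 <= 1:
            count += trace_one_child(s2, child_count2, digits_count, rows)
    return count


rows = []
total = trace_one_child(0, 0, 3, rows)
# Rows whose first column has 3 digits are the finished candidates.
three_digit_sum = sum(cc for s, cc in rows if s >= 100)
print(total == three_digit_sum)
```

The pruning is what makes this faster than testing every number: once a prefix has two divisible substrings, none of its extensions are visited.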