How can I optimize this in Python?

Time: 2017-11-12 14:56:12

Tags: python algorithm python-3.x time-complexity

I want to count pairs for very large numbers. Given a number n, I need to determine the number of pairs (x, y) such that

  • S(x) < S(y) where S(k) denotes the sum of digits of integer k.
  • 0 <= x < y <= n

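The constraint is 1 <= n <= 10^250.

For example, if the number is 3, the valid pairs are (0,1), (0,2), (0,3), (1,2), (1,3) and (2,3), so the count is 6, which is the answer. For this I wrote the following code: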

#!/bin/python3

import sys
from itertools import permutations

def sumofelement(n):
    sum = 0
    while(n>0):
        temp = n%10
        sum = sum + temp
        n = n//10
    return sum

def validpair(x):
    x, y = x
    if sumofelement(x) < sumofelement(y):
        return True

def countPairs(n):
    z = [x for x in range(n+1)]
    permuation = permutations(z, 2)
    count = 0
    for i in permuation:
        print(i, validpair(i))
        if validpair(i):
            count += 1
    return count%(1000000007)

if __name__ == "__main__":
    n = int(input())
    result = countPairs(n)
    print(result)

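The problem arises when the number gets very large, say 10^250. How can I optimize this? I tried searching but could not find any efficient solution.

4 Answers:

Answer 0: (score: 1)

Note: this answer does not take into account the constraint (x<y) that was added later, and it does not handle huge inputs such as 10^250. It suggests improvements to the OP's code, as requested.

There seems to be no need to actually generate the pairs. That means not storing and manipulating elements such as (1000, 900), but storing and manipulating their digit sums directly: (1, 9).

So you can make this modification to your existing function: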

def countPairs(n):
    z = [sumofelement(x) for x in range(n+1)]
    p = permutations(z, 2)
    count = 0
    for x,y in p:
        if (x<y):
            count += 1
    return count%(1000000007)

Test for n = 2K

> time python3 test.py #old
1891992

real    0m15.967s
user    0m15.876s
sys     0m0.049s
> time python3 test2.py #new
1891992

real    0m0.767s
user    0m0.739s
sys     0m0.022s

For n = 5K

11838575

real    1m32.159s
user    1m30.381s
sys     0m0.444s

11838575

real    0m4.280s
user    0m4.258s
sys     0m0.012s

Although the run time dropped by about 95%, it still appears to be O(n^2).

So here is a different approach:

from collections import Counter

def sum_digits(n):
    s = 0
    while n:
        s += n % 10
        n //= 10
    return s

def count_pairs(n):
    z = [sum_digits(x) for x in range(n+1)]
    c = Counter(z)
    final = sorted(c.items(), reverse=True)
    print(final)

    count = 0
    older = 0
    for k,v in final:
        count += older * v
        older += v
    return count

if __name__ == "__main__":
    n = int(input())
    print(count_pairs(n))

We create a dictionary { sum_of_digits: occurrences } and turn it into a list sorted in descending order. For example, for n=10 this would be

[(9, 1), (8, 1), (7, 1), (6, 1), (5, 1), (4, 1), (3, 1), (2, 1), (1, 2), (0, 1)]

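As we walk through the list, at each node the number of occurrences multiplied by the total number of occurrences in all previous nodes is the contribution of the numbers with that digit sum to the total pair count. The walk is probably O(n) overall, since the counter is tiny compared with the actual data.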
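The sweep described above can be isolated into a small helper. This is a minimal sketch; the name `pairs_from_histogram` is hypothetical, and like the answer's code it ignores the x < y constraint.

```python
from collections import Counter

def pairs_from_histogram(hist):
    """Count unordered pairs of items whose keys differ, given {key: occurrences}."""
    count = 0
    seen = 0  # items with a strictly larger key processed so far
    for _, occurrences in sorted(hist.items(), reverse=True):
        count += seen * occurrences  # each new item pairs with every earlier one
        seen += occurrences
    return count

# Digit-sum histogram for n = 10, built the same way as in the answer.
hist = Counter(sum(int(c) for c in str(x)) for x in range(11))
print(pairs_from_histogram(hist))
```

For n = 10 this prints 54: of the 55 ways to choose two numbers from 0..10, only {1, 10} share a digit sum.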

Test, n = 2K

[(28, 1), (27, 4), (26, 9), (25, 16), (24, 25), (23, 36), (22, 49), (21, 64), (20, 81), (19, 100), (18, 118), (17, 132), (16, 142), (15, 148), (14, 150), (13, 148), (12, 142), (11, 132), (10, 118), (9, 100), (8, 81), (7, 64), (6, 49), (5, 36), (4, 25), (3, 16), (2, 10), (1, 4), (0, 1)]
1891992

real    0m0.074s
user    0m0.060s
sys     0m0.014s

n = 67535

[(41, 1), (40, 5), (39, 16), (38, 39), (37, 80), (36, 146), (35, 245), (34, 384), (33, 570), (32, 809), (31, 1103), (30, 1449), (29, 1839), (28, 2259), (27, 2692), (26, 3117), (25, 3510), (24, 3851), (23, 4119), (22, 4296), (21, 4370), (20, 4336), (19, 4198), (18, 3965), (17, 3652), (16, 3281), (15, 2873), (14, 2449), (13, 2030), (12, 1634), (11, 1275), (10, 962), (9, 700), (8, 490), (7, 329), (6, 210), (5, 126), (4, 70), (3, 35), (2, 15), (1, 5), (0, 1)]
2174358217

real    0m0.278s
user    0m0.264s
sys     0m0.014s

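Answer 1: (score: 0)

Your expected output is the number of valid pairs, not a list of all valid pairs. You can compute this number with simple combinatorics, without checking every possibility.

For n=3, the number of pairs is the number of pairs for n=2 plus the number of pairs of the form (x,3). Here x can lie in the range [0, n-1], which contains n elements.

The code can use recursion, a loop, or a formula. All of them should compute the same number, and the formula is obviously the fastest.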

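# Recursive version
def countPairs(n):
    if n == 1:
        return 1 # pair (0,1)
    return countPairs(n-1) + n

# Loop version
def countPairs(n):
    ret = 0
    for x in range(1, n+1): # x adds the x pairs (0,x) ... (x-1,x)
        ret += x
    return ret

# Closed-form version
def countPairs(n):
    return n*(n+1)//2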
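As a quick sanity check, the three forms can be compared side by side. This sketch assumes the recurrence stated above (each n contributes n new pairs of the form (x, n)) and uses hypothetical, distinct function names so that all three can coexist. Like the answer, it counts every pair with x < y and does not look at digit sums.

```python
def count_recursive(n):
    # count(n) = count(n - 1) + n, anchored at count(1) = 1
    return 1 if n == 1 else count_recursive(n - 1) + n

def count_loop(n):
    total = 0
    for y in range(1, n + 1):
        total += y  # y pairs (0, y) ... (y - 1, y)
    return total

def count_formula(n):
    return n * (n + 1) // 2  # integer division keeps the result an int

for n in range(1, 200):
    assert count_recursive(n) == count_loop(n) == count_formula(n)

print(count_formula(3))  # 6, matching the example in the question
```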

Answer 2: (score: 0)

Let us, for the moment, ignore the second constraint x < y.

Then one strategy is to lump together all numbers that have the same sum of digits (s.o.d.). For example, if your n is 10^250, then s.o.d. 1000 will occur exactly 1983017386182586348981409609496904425087072829264027113757253089553808835917213113938681936684409793628116446312878275672807319027255318921532869897133614537254127444640190990014430819413778762187558412950294708704055125267388491053875845430856889 times, and s.o.d. 10 313933899756954150622536381330141298432969991820256911356499404510841221727177771404498196898219200726905190334516036530761185604472351714146659153338691825580165670694714765688631611013643183449188160088364429780094383087473530152672586062700335444189441183499432858425871184639350 times. So these two s.o.d.s together contribute the product of these two counts (the `nonsod` values) as pairs.

Or, with somewhat less megalomaniac numbers: for n = 88, s.o.d.1 = 10 and s.o.d.2 = 7, we get 8 and 8 occurrences, hence 64 pairs.
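The small example can be checked by brute force. This is a minimal sketch; `digit_sum` and `class_size` are names chosen here for illustration only.

```python
def digit_sum(k):
    return sum(int(c) for c in str(k))

def class_size(n, s):
    """How many integers in [0, n] have digit sum s."""
    return sum(1 for x in range(n + 1) if digit_sum(x) == s)

a = class_size(88, 10)  # 19, 28, 37, 46, 55, 64, 73, 82
b = class_size(88, 7)   # 7, 16, 25, 34, 43, 52, 61, 70
print(a, b, a * b)      # 8 8 64
```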

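The code below implements this strategy using a simple digit-by-digit recurrence (function `nonsod`). Because there are many redundant branches, a cache is used.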
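To illustrate the kind of digit-by-digit recurrence meant here, the following is an independent sketch, not the answer's own code. It assumes `functools.lru_cache` as the cache, and the names `digit_sum_histogram` and `unordered_pairs` are hypothetical. It still ignores the x < y constraint.

```python
from functools import lru_cache

def digit_sum_histogram(n):
    """Return hist where hist[k] = how many integers in [0, n] have digit sum k."""
    digits = [int(c) for c in str(n)]

    @lru_cache(maxsize=None)
    def free(length, k):
        # Number of digit strings of the given length (leading zeros allowed)
        # whose digits add up to k.
        if k < 0 or k > 9 * length:
            return 0
        if length == 0:
            return 1  # only k == 0 reaches this point
        return sum(free(length - 1, k - d) for d in range(10))

    hist = [0] * (9 * len(digits) + 1)
    prefix_sum = 0  # digit sum of the prefix of n fixed so far
    for pos, digit in enumerate(digits):
        remaining = len(digits) - pos - 1
        # Put a smaller digit at this position; the rest is then unrestricted.
        for d in range(digit):
            for k in range(9 * remaining + 1):
                hist[prefix_sum + d + k] += free(remaining, k)
        prefix_sum += digit
    hist[prefix_sum] += 1  # n itself
    return hist

def unordered_pairs(n):
    """Pairs of numbers in [0, n] with different digit sums, ignoring x < y."""
    hist = digit_sum_histogram(n)
    total = sum(hist)
    return (total * total - sum(v * v for v in hist)) // 2

hist_88 = digit_sum_histogram(88)
print(hist_88[10], hist_88[7])  # 8 8
```

The work grows with the number of digits of n rather than with n itself, which is what makes inputs of this size approachable at all.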

For n = 49689518924223997098471223543364330459595831386684873270186194285660874002514005047966357557084650317768560146609913273315351520002512374912739761203458271777707529815027881619901050952541693486379889157466211100495006800815751752605470841565728511141845695222712435837491694221722360852940495211481721723206152092725455942611410225513504242173241811867522974465909681478041570056834016566434386955417360661126555266582980778790541324964301380703686112669669641207272764740986099727604245250714092580, the full count (still without enforcing x < y) is 25984328769282898156215987070093760297836281753626742070663593024918781683928674045441700800803359016753562494186043552665812224996953995704125243157891603184533274543105499314528302202972742702392476556566583829840036706378670333595223855845665062500914398291514442277659839377773164451943550566697849130769244805996419427202677753063819693113304304818586290078490380143872959635951851910822582661516954316275598690668540412688085631222123413887008350968291853549698946413333342843654709903250347001, and it is computed in just a few seconds.

Now let us get back to constraint two: x < y

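We can reuse the unconstrained code by treating the leftmost digits of x and y separately. If they are equal, we can chop them off and recurse. Otherwise the one belonging to x must be the smaller. After chopping we are essentially back at a one-constraint problem; only an extra 'grace' parameter is required. For example, if the first digit of x is smaller than the first digit of y, then the remaining s.o.d. of x may exceed that of y, by up to two when the first digits differ by three.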
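One way to read the 'grace' idea, sketched here as an assumption derived from the arithmetic rather than taken from the answer: if x starts with digit a and y with digit b > a, then S(x) < S(y) holds exactly when the remaining digit sum of x is at most that of y plus b - a - 1. The brute-force check below uses hypothetical names and small remainders only.

```python
def digit_sum(k):
    return sum(int(c) for c in str(k))

def grace_rule_holds(width=1):
    """Check the rule for all leading digits a < b and all remainders of `width` digits."""
    base = 10 ** width
    for a in range(10):
        for b in range(a + 1, 10):
            grace = b - a - 1  # how far x's remaining digit sum may exceed y's
            for rx in range(base):
                for ry in range(base):
                    x = a * base + rx
                    y = b * base + ry
                    direct = digit_sum(x) < digit_sum(y)
                    via_grace = digit_sum(rx) <= digit_sum(ry) + grace
                    if direct != via_grace:
                        return False
    return True

print(grace_rule_holds())  # True
```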

This algorithm gives the expected result for 67535, and 10^250 is still feasible (2 minutes on my rather modest laptop). The code:

import itertools as it

_cache_a = {}
_cache_b = {}
_max_k = 300*9 + 1 # good for up to 300 digits

def maxsod(n):
    # find largest possible sum of digits
    return (len(n) - 1) * 9 + int(n[0]) + all(d == '9' for d in n[1:])

def nonsod_str(n, k):
    # first anchor the recurrence and deal with some special cases
    if k < 0:
        return 0
    elif k == 0:
        return 1
    elif n == '0':
        return 0
    elif len(n) == 1:
        return int(k <= int(n))
    max_k_n = maxsod(n)
    if k >= max_k_n:
        return 0
    max_k_n = min(_max_k, max_k_n)
    _cache_n = _cache_a.setdefault(int(n), max_k_n * [-1])
    if _cache_n[k] < 0:
        # a miss
        # remove leftmost digit and any zeros directly following
        lead = int(n[0])
        for j, d in enumerate(n[1:], 1):
            if d != '0':
                break
        next_n = n[j:]
        nines = (len(n) - 1) * '9'
        _cache_n[k] = sum(nonsod_str(nines, k-j) for j in range(lead)) \
                      + nonsod_str(next_n, k-lead)
    return _cache_n[k]

def nonsod(n, k):
    "number of numbers between 0 and n incl whose sum of digits equals k"
    assert k < _max_k
    return nonsod_str(str(n), k)

def count(n):
    sods = [nonsod(n, k) for k in range(maxsod(str(n)))]
    sum_ = sum(sods)
    return (sum_*sum_ - sum(s*s for s in sods)) // 2

def mixed(n, m, grace):
    nsods = [nonsod(n, k) for k in range(maxsod(str(n)))]
    msods = ([nonsod(m, k) for k in range(maxsod(str(m)))]
             if n != m else nsods.copy())
    ps = it.accumulate(msods)
    if len(msods)-grace < len(nsods):
        delta = len(nsods) - len(msods) + grace
        nsods[-1-delta:] = [sum(nsods[-1-delta:])]
    return sum(x*y for x, y in zip(it.islice(ps, grace, None), nsods))

def two_constr(n):
    if (n<10):
        return (n * (n+1)) // 2
    if not n in _cache_b:
        n_str = str(n)
        lead = int(n_str[0])
        next_n = int(n_str[1:])
        nines = 10**(len(n_str)-1) - 1
        # first digit equal
        fde = two_constr(next_n) + lead * two_constr(nines)
        # first digit different, larger one at max
        fddlm = sum(mixed(next_n, nines, grace) for grace in range(lead))
        # first digit different, both below max
        fddbbm = sum((lead-1-grace) * mixed(nines, nines, grace)
                     for grace in range(lead-1))
        _cache_b[n] = fde + fddlm + fddbbm
    return _cache_b[n]

Answer 3: (score: 0)

Hopefully this (non-optimized) algorithm matches the newly defined formula:

def digital_sum(n):
    return sum(int(c) for c in str(n))

def count(n):
    return sum(digital_sum(x) < digital_sum(y) for x in range(n) for y in range(x+1, n+1))

for n in range(1, 20):
    print(count(n), end=",")

Prints:

1,3,6,10,15,21,28,36,45,46,49,54,61,70,81,94,109,126,145,

OEIS indeed does not know this sequence.
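A middle ground between this brute force and a digit-based approach, offered as a sketch only: process y in increasing order and keep a tally of the digit sums already seen, so each y costs work proportional to the largest possible digit sum instead of to n. The name `count_incremental` is hypothetical. The loop is still linear in n, so it does not come close to handling 10^250.

```python
def count_incremental(n):
    """Count pairs 0 <= x < y <= n with digit_sum(x) < digit_sum(y)."""
    max_sum = 9 * len(str(n))
    seen = [0] * (max_sum + 1)  # seen[s] = how many x < y have digit sum s
    total = 0
    for y in range(n + 1):
        s = sum(int(c) for c in str(y))
        total += sum(seen[:s])  # earlier x with a strictly smaller digit sum
        seen[s] += 1
    return total

print([count_incremental(n) for n in range(1, 11)])
# [1, 3, 6, 10, 15, 21, 28, 36, 45, 46]
```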

Original post:

Judging by its output, your function seems to match this OEIS sequence. A concise implementation would be:

def count(n):
    return sum(9 * i // 10 for i in range(n + 1))

Note that this is still slow for 10^250. Just my two cents.