
Question

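For a given number N, find the total number of ordered pairs (x, y) such that x and y are both less than or equal to N and the digit sum of x is less than the digit sum of y.

For example, for N = 6 there are 21 possible ordered pairs: [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 3), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6), (4, 5), (4, 6), (5, 6)]

Here x is always less than y, the digit sum of x is less than the digit sum of y, and both x and y are less than or equal to N. This is my naive approach. It gives correct results but is slow: it works acceptably up to about N = 10000 and performs poorly beyond that.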

from itertools import permutations
n=100
lis=list(range(n+1))
y=list(i for i in permutations(lis,2) if i[0]<i[1] and sum(list(map(int,
(list(str(i[0]))))))<sum(list(map(int,(list(str(i[1])))))))
print(len(y))

A version using a generator:

from itertools import permutations
for _ in range(int(input())):
    n=1000
    lis=range(n+1)
    y=(i for i in permutations(lis,2) if i[0]<i[1] and sum(list(map(int,
     (list(str(i[0]))))))<sum(list(map(int,(list(str(i[1])))))))
    print (sum(1 for _ in y))

A further improved version:

from itertools import permutations
for _ in range(int(input())):
    n=1000
    lis=range(n+1)
    y=(i for i in permutations(lis,2) if i[0]<i[1] and sum(map(int,(str(i[0]))))<sum(map(int,(list(str(i[1]))))))
    print (sum(1 for _ in y))

Is there a better way to solve this problem?

Answer 1

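How the code works

This is almost entirely an algorithmic improvement over your approach. Using generators or list comprehensions might make it faster still, but you would have to profile it to check. The algorithm works as follows: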

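Precompute the digit sum of every number up to N.
Group the numbers up to N by digit sum. This gives a structure that looks like the rows below, one row per digit sum. If we want the numbers with digit sum > 2, we only need to count the numbers from the third row onward.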

1: 1, 10
2: 2, 11, 20
3: 3, 12, 21, 30
......

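Observe that the numbers in each row are in sorted order. If our number is 12, we only need to look at the numbers after 12 in each row. We can find where 12 falls in each row with a binary search.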
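The grouping and binary-search steps above can be sketched on a small range as follows. This is only an illustration: the helper names `group_by_digit_sum` and `numbers_after` are made up for this example and do not appear in the answer's code, and `bisect_right` is used here where the answer uses `bisect_left` (the two agree whenever the number being looked up is not in the row, which is always the case when its digit sum is smaller than the row's).

```python
import bisect
from collections import defaultdict


def digit_sum(n):
    """Return the sum of the decimal digits of a non-negative integer."""
    total = 0
    while n:
        total, n = total + n % 10, n // 10
    return total


def group_by_digit_sum(limit):
    """Map each digit sum to the ascending list of numbers in 1..limit having it."""
    groups = defaultdict(list)
    for number in range(1, limit + 1):
        # Numbers are visited in ascending order, so every row stays sorted.
        groups[digit_sum(number)].append(number)
    return dict(groups)


def numbers_after(sorted_row, x):
    """Return the entries of sorted_row that are strictly greater than x."""
    start = bisect.bisect_right(sorted_row, x)
    return sorted_row[start:]


groups = group_by_digit_sum(30)
for key in (1, 2, 3):
    print(key, groups[key])
# 1 [1, 10]
# 2 [2, 11, 20]
# 3 [3, 12, 21, 30]

# 12 has digit sum 3, so its partners come from the rows for digit sum 4 and up.
print(numbers_after(groups[4], 12))  # [13, 22]
```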

Overall, this is roughly a 20x improvement over your algorithm, at a memory cost of O(N).

Code

import time
import bisect
import itertools

N = 6

def sum_digits(n):
    # stolen from here: https://stackoverflow.com/questions/14939953/sum-the-digits-of-a-number-python
    # there may be a faster way of doing this based on the fact that you're doing this over 1 .. N
    r = 0
    while n:
        r, n = r + n % 10, n // 10
    return r        

t = time.time()
# trick 1: precompute all of the digit sums. This cuts the time to ~0.3s on N = 1000
digit_sums = [sum_digits(i) for i in range(N+1)]
digit_sum_map = {}

# trick 2: group the numbers by the digit sum, so we can iterate over all the numbers with a given digit sum very quickly
for i, key in enumerate(digit_sums):
    try:
        digit_sum_map[key].append(i)
    except KeyError:
        digit_sum_map[key] = [i]
max_digit_sum = max(digit_sum_map.keys())

# trick 3: note that we insert elements into the digit_sum_map in order. thus we can binary search within the map to find
# where to start counting from. 
result = []
for i in range(N):
    for ds in range(digit_sums[i] + 1, max_digit_sum + 1):
        result.extend(zip(itertools.repeat(i), digit_sum_map[ds][bisect.bisect_left(digit_sum_map[ds], i):]))

print('took {} s, answer is {} for N = {}'.format(time.time() - t, len(result), N))
# took 0.0 s, answer is 21 for N = 6
# took 0.11658287048339844 s, answer is 348658 for N = 1000
# took 8.137377977371216 s, answer is 33289081 for N = 10000

# for reference, your last method takes 2.45 s on N = 1000 on my machine
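
If only the count is needed, a possible variant of the same idea is to add up the slice lengths instead of materializing every pair, which avoids storing the result list. The sketch below is an assumption-laden rewrite, not the answer's code: `count_pairs` is a hypothetical name, the rows are kept in a list indexed by digit sum rather than a dict, and the digit sums are filled in with the recurrence digit_sum(i) = digit_sum(i // 10) + i % 10, one way of exploiting the fact that they are computed over a whole range.

```python
import bisect


def count_pairs(limit):
    """Count pairs (x, y) with 0 <= x < y <= limit and digit_sum(x) < digit_sum(y)."""
    # digit_sums[i] reuses the already computed digit sum of i // 10.
    digit_sums = [0] * (limit + 1)
    for i in range(1, limit + 1):
        digit_sums[i] = digit_sums[i // 10] + i % 10

    # rows[s] holds, in ascending order, every number whose digit sum is s.
    rows = [[] for _ in range(max(digit_sums) + 1)]
    for number, s in enumerate(digit_sums):
        rows[s].append(number)

    total = 0
    for x in range(limit):
        for s in range(digit_sums[x] + 1, len(rows)):
            row = rows[s]
            # Every entry from this index onward is greater than x.
            total += len(row) - bisect.bisect_right(row, x)
    return total


print(count_pairs(6))  # 21
```

The loop structure is unchanged, so the running time still grows with N times the number of distinct digit sums; the saving is in memory and in not building tuples.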

Answer 2

def find_total_possible(n):
    x = [i for i in range(n + 1)]
    y = [i + 1 for i in range(n + 1)]
    z = list(zip(x,y))
    return z

Is this homework?

Smells like homework.

Answer 3

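One problem is that you still generate all permutations and then discard the entries where x is greater than or equal to y. Another is that you recompute the digit sum of y on every iteration, when it could be computed once, stored, and compared later. There may be a more elegant solution: if you know that no future x entry can meet the criterion, you can break out of the nested loop early.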
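As a rough sketch of those two points in Python 3 (not the code from this answer, which follows), `itertools.combinations` over an ascending range yields only pairs with x < y, and a precomputed list avoids recomputing digit sums per pair. The function name `ordered_pairs` is invented for this illustration.

```python
from itertools import combinations


def ordered_pairs(limit):
    """List pairs (x, y) with x < y <= limit where digit_sum(x) < digit_sum(y)."""
    # Compute each digit sum once instead of once per candidate pair.
    digit_sums = [sum(map(int, str(i))) for i in range(limit + 1)]
    # combinations() over an ascending range yields only pairs with x < y,
    # so no permutations need to be generated and then thrown away.
    return [
        (x, y)
        for x, y in combinations(range(limit + 1), 2)
        if digit_sums[x] < digit_sums[y]
    ]


print(len(ordered_pairs(6)))  # 21
```

This still examines every pair with x < y, so it remains quadratic in N; it only removes the wasted work described above.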

from itertools import permutations
from time import time

def time_profile(f):

    def time_func(*args, **kwargs):
        start_time = time()
        r = f(*args, **kwargs)
        end_time = time()
        print "{} Time: {}".format(f, end_time - start_time)
        return r

    return time_func

@time_profile
def calc1(n):
    lis=list(range(n+1))
    y=list(i for i in permutations(lis,2) if i[0]<i[1] and sum(list(map(int,
    (list(str(i[0]))))))<sum(list(map(int,(list(str(i[1])))))))
    return y

@time_profile
def calc2(n):
    l = []
    for y in xrange(n+1):
        y_sum = sum(map(int, str(y)))
        for x in xrange(y):
            # May be possible to use x_digits to break
            x_digits = map(int, str(x))
            if sum(x_digits) >= y_sum: continue
            l.append((x, y))

    return l

if __name__ == '__main__':
    N = 10000
    if len(calc1(N)) != len(calc2(N)): print 'fail'

<function calc1 at 0xfff25cdc> Time: 233.378999949

<function calc2 at 0xfff2c09c> Time: 84.9670000076

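A few other points unrelated to the question itself. Some of your calls to list are redundant. In Python 2 the map function already returns a list, and in Python 3 it returns an iterator that sum accepts directly. In Python 3, range returns a lazy object that produces one value at a time as you iterate over it. That is more memory efficient and works just the same.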
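
A small Python 3 illustration of dropping the redundant list calls, offered as a sketch rather than a rewrite of the question's code:

```python
def digit_sum(n):
    """Sum the decimal digits of a non-negative integer."""
    # str(n) is already iterable character by character, and sum() accepts
    # the iterator returned by map(), so no list() call is needed.
    return sum(map(int, str(n)))


# range() produces its values lazily in Python 3, so no list is built here either.
total = sum(digit_sum(i) for i in range(1000))
print(total)  # 13500
```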

Ordered pairs of numbers
