Question

I am trying to generate a list of number sequences in Python, like the one shown below.

[0,0,0,0] [0,0,0,1] [0,0,0,2] [0,0,1,1] [0,0,1,2] [0,0,2,2] [0,1,1,1]
[0,1,1,2] [0,1,2,2] [0,2,2,2] [1,1,1,1] [1,1,1,2] ... [2,2,2,2]

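Right now I can do this in pure Python with recursive calls, but a single run takes a very long time (several hours). Is it possible to do this with numpy and save a significant amount of time, and if so, how?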

Answer 1

Do you mean this (or did you mean something defined differently)?

from itertools import product

for item in product((0, 1, 2), repeat=4):
    print(item)

This prints:

(0, 0, 0, 0)
(0, 0, 0, 1)
(0, 0, 0, 2)
(0, 0, 1, 0)
(0, 0, 1, 1)
(0, 0, 1, 2)
...
(2, 2, 1, 2)
(2, 2, 2, 0)
(2, 2, 2, 1)
(2, 2, 2, 2)

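I'm not sure whether this is what you are looking for, but product (from itertools) ships with Python.

This should be fast and memory-efficient, because the tuples are produced on demand rather than stored in a list.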
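
To illustrate the on-demand behaviour, here is a minimal sketch (the repeat=30 value is only an illustrative choice, not something from the question). The full product would contain 3**30 tuples, far too many to store, yet the first few are available immediately:

```python
from itertools import islice, product

# product() returns a lazy iterator: tuples are computed only when requested.
# repeat=30 is an arbitrary illustrative value; 3**30 tuples would not fit in memory.
lazy = product((0, 1, 2), repeat=30)

# Take just the first three tuples without materialising the rest.
first_three = list(islice(lazy, 3))

for t in first_three:
    print(t[-4:])  # show only the last four positions of each tuple
```

Only the tuples actually consumed are ever built, which is why looping over product directly is cheap in memory even when the total count is huge.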

...on second thought, this may be what you meant, right?

for a, b, c, d in product((0, 1, 2), repeat=4):
    if not a <= b <= c <= d:
        continue
    print(a,b,c,d)

With output:

0 0 0 0, 0 0 0 1, 0 0 0 2, 0 0 1 1, 0 0 1 2, 0 0 2 2, 0 1 1 1, 
0 1 1 2, 0 1 2 2, 0 2 2 2, 1 1 1 1, 1 1 1 2, 1 1 2 2, 1 2 2 2, 
2 2 2 2,

Now I see how you would want this made more efficient...

It looks like Praveen's answer does exactly that.
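
To make the inefficiency of the filtering approach concrete, here is a rough sketch that counts how many tuples the filter inspects versus how many it keeps. The helper name count_filtered is hypothetical, and math.comb assumes Python 3.8 or later. The filter walks all n**m tuples, while the number of non-decreasing ones is only C(n+m-1, m):

```python
from itertools import product
from math import comb  # assumes Python 3.8+


def count_filtered(n, m):
    """Hypothetical helper: return (tuples inspected, tuples kept) for the filter."""
    inspected = kept = 0
    for t in product(range(n), repeat=m):
        inspected += 1
        # Keep only non-decreasing tuples, as in the a <= b <= c <= d test above.
        if all(a <= b for a, b in zip(t, t[1:])):
            kept += 1
    return inspected, kept


inspected, kept = count_filtered(3, 4)
print(inspected, kept)        # 81 15
print(comb(3 + 4 - 1, 4))     # 15, the multiset-coefficient count
```

The gap between the two numbers grows quickly with m, which is why generating only the wanted tuples is preferable to generating everything and discarding most of it.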

Answer 2

What you are looking for is itertools.combinations_with_replacement. From the documentation:

combinations_with_replacement('ABCD', 2) --> AA AB AC AD BB BC BD CC CD DD

Therefore:

>>> import itertools as it
>>> list(it.combinations_with_replacement((0, 1, 2), 4))
[(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 0, 2),
 (0, 0, 1, 1), (0, 0, 1, 2), (0, 0, 2, 2),
 (0, 1, 1, 1), (0, 1, 1, 2), (0, 1, 2, 2),
 (0, 2, 2, 2), (1, 1, 1, 1), (1, 1, 1, 2),
 (1, 1, 2, 2), (1, 2, 2, 2), (2, 2, 2, 2)]

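The best part of this approach is that it returns an iterator, so you can loop over the results without storing them all. That is a big advantage, because it can save a lot of memory.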
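
As a small sketch of iterating without storing, the loop below streams the tuples and keeps only a running count. The helper name count_with_sum and the sum condition are hypothetical stand-ins for whatever per-tuple work is actually needed:

```python
import itertools as it


def count_with_sum(n, m, target):
    """Hypothetical helper: count non-decreasing m-tuples over range(n) summing to target."""
    count = 0
    # Each tuple is produced, examined, and discarded; no list is ever built.
    for combo in it.combinations_with_replacement(range(n), m):
        if sum(combo) == target:
            count += 1
    return count


print(count_with_sum(3, 4, 4))
```

Memory use here stays constant regardless of how many tuples are generated, since at most one tuple is alive at a time.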

Other implementations and timings

Here are a few more implementations, including a numpy one. combinations_with_replacement (the try2 function) appears to be the fastest:

import itertools as it
import timeit

import numpy as np

def try1(n, m):
    return [t for t in it.product(range(n), repeat=m) if all(a <= b for a, b in zip(t[:-1], t[1:]))]

def try2(n, m):
    return list(it.combinations_with_replacement(range(n), m))

def try3(n, m):
    a = np.mgrid[(slice(0, n),) * m] # All points in an m-dimensional grid within the given ranges
    a = np.rollaxis(a, 0, m + 1)     # Make the 0th axis into the last axis
    a = a.reshape((-1, m))           # Now you can safely reshape while preserving order
    return a[np.all(a[:, :-1] <= a[:, 1:], axis=1)]

>>> %timeit b = try1(3, 4)
10000 loops, best of 3: 78.1 µs per loop
>>> %timeit b = try2(3, 4)
The slowest run took 8.04 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.66 µs per loop
>>> %timeit b = try3(3, 4)
10000 loops, best of 3: 97.8 µs per loop

The same holds for larger inputs:

>>> %timeit b = try1(3, 6)
1000 loops, best of 3: 654 µs per loop
>>> %timeit b = try2(3, 6)
The slowest run took 7.20 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.33 µs per loop
>>> %timeit b = try3(3, 6)
10000 loops, best of 3: 166 µs per loop
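
The %timeit lines above are IPython magics. Outside IPython, a comparable measurement can be sketched with the standard timeit module. This is only a rough harness for the two pure-Python functions; the loop counts are arbitrary choices, and the numbers it prints will differ from machine to machine:

```python
import itertools as it
import timeit


def try1(n, m):
    # Filter the full Cartesian product down to non-decreasing tuples.
    return [t for t in it.product(range(n), repeat=m)
            if all(a <= b for a, b in zip(t[:-1], t[1:]))]


def try2(n, m):
    # Generate only the non-decreasing tuples directly.
    return list(it.combinations_with_replacement(range(n), m))


# Sanity check: both approaches yield the same tuples in the same order.
assert try1(3, 4) == try2(3, 4)

LOOPS = 200  # arbitrary; raise for steadier numbers
for func in (try1, try2):
    # Best of 3 repeats, converted to microseconds per call.
    best = min(timeit.repeat(lambda: func(3, 6), number=LOOPS, repeat=3))
    print(f"{func.__name__}: {best / LOOPS * 1e6:.2f} us per call")
```

Checking that the outputs agree before timing guards against comparing functions that do not actually compute the same thing.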

Notes:

I am using Python 3.

I used this answer to implement try1.

I used this answer to implement try3.
