I'm trying to generate a list of number sequences in Python, like the one shown below.
[0,0,0,0] [0,0,0,1] [0,0,0,2] [0,0,1,1] [0,0,1,2] [0,0,2,2] [0,1,1,1]
[0,1,1,2] [0,1,2,2] [0,2,2,2] [1,1,1,1] [1,1,1,2] ... [2,2,2,2]
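Right now I can do this in pure Python with recursive calls, but a single run takes a very long time (several hours). I'm wondering whether this can be done with numpy to save a lot of time, and if so, how?
Answer 0 (score: 2)
Do you mean this (or how exactly is your sequence defined)?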
from itertools import product

for item in product((0, 1, 2), repeat=4):
    print(item)
This prints:
(0, 0, 0, 0)
(0, 0, 0, 1)
(0, 0, 0, 2)
(0, 0, 1, 0)
(0, 0, 1, 1)
(0, 0, 1, 2)
...
(2, 2, 1, 2)
(2, 2, 2, 0)
(2, 2, 2, 1)
(2, 2, 2, 2)
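I'm not sure whether this is what you're looking for, but itertools.product is part of Python's standard library.
It should be fast and memory-efficient, because the tuples are created on demand rather than stored up front.
...On second thought, this is probably what you meant, right?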
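To illustrate the on-demand behaviour, here is a minimal sketch. The helper name `first_items` and the choice of `repeat=40` are made up for this example; the point is only that the first few tuples come back immediately even though the full product could never be stored.

```python
from itertools import islice, product

def first_items(iterable, k):
    """Return the first k items of an iterable without consuming the rest."""
    return list(islice(iterable, k))

# 3**40 tuples in total, far too many to hold in memory,
# yet only the three requested tuples are ever built.
lazy = product((0, 1, 2), repeat=40)
head = first_items(lazy, 3)

# Show the last four positions of each tuple.
for t in head:
    print(t[-4:])
```

The same pattern works for any of the itertools iterators used in this thread.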
for a, b, c, d in product((0, 1, 2), repeat=4):
    if not a <= b <= c <= d:
        continue
    print(a, b, c, d)
With output:
0 0 0 0, 0 0 0 1, 0 0 0 2, 0 0 1 1, 0 0 1 2, 0 0 2 2, 0 1 1 1,
0 1 1 2, 0 1 2 2, 0 2 2 2, 1 1 1 1, 1 1 1 2, 1 1 2 2, 1 2 2 2,
2 2 2 2,
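Now I see how you'd want this to be more efficient...
It looks like Praveen's answer does exactly that.
Answer 1 (score: 2)
What you're looking for is itertools.combinations_with_replacement. From the documentation: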
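To see why filtering the full product is wasteful, the sketch below counts how many candidates the filter examines against how many it keeps. The function name `count_filtered` is hypothetical, and `math.comb` assumes Python 3.8 or newer. The number of non-decreasing sequences of length m over n values is the multiset coefficient C(n + m - 1, m), while the product always has n**m tuples.

```python
from itertools import product
from math import comb

def count_filtered(n, m):
    """Count non-decreasing tuples by filtering all n**m product tuples."""
    return sum(
        1
        for t in product(range(n), repeat=m)
        if all(a <= b for a, b in zip(t, t[1:]))
    )

n, m = 3, 4
candidates = n ** m               # tuples the filter has to examine
survivors = count_filtered(n, m)  # tuples that pass the ordering test

# The survivors match the closed-form multiset count.
assert survivors == comb(n + m - 1, m)
print(candidates, survivors)      # 81 15
```

The gap grows quickly: the product grows exponentially in m, while the number of kept sequences grows only polynomially for fixed n.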
combinations_with_replacement('ABCD', 2) AA AB AC AD BB BC BD CC CD DD
So:
>>> import itertools as it
>>> list(it.combinations_with_replacement((0, 1, 2), 4))
[(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 0, 2),
(0, 0, 1, 1), (0, 0, 1, 2), (0, 0, 2, 2),
(0, 1, 1, 1), (0, 1, 1, 2), (0, 1, 2, 2),
(0, 2, 2, 2), (1, 1, 1, 1), (1, 1, 1, 2),
(1, 1, 2, 2), (1, 2, 2, 2), (2, 2, 2, 2)]
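The best part of this approach is that combinations_with_replacement returns a lazy iterator, so you can loop over the results without ever storing them all. That is a big advantage because it saves a lot of memory. (The list(...) call above is only there to display the result.)
Here are a few more implementations, including a numpy one. combinations_with_replacement (the try2 function) appears to be the fastest: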
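As a minimal sketch of iterating without storing anything, the loop below keeps only a running counter while the sequences stream past. The function `count_with_sum` and its `target` parameter are invented for this illustration and are not part of the question.

```python
import itertools as it

def count_with_sum(n, m, target):
    """Count non-decreasing length-m sequences over range(n) that sum to target."""
    total = 0
    # Each tuple is produced, inspected, and discarded; no list is built.
    for seq in it.combinations_with_replacement(range(n), m):
        if sum(seq) == target:
            total += 1
    return total

# (0, 0, 2, 2), (0, 1, 1, 2) and (1, 1, 1, 1) are the matches here.
print(count_with_sum(3, 4, 4))  # 3
```

Memory use stays constant regardless of how many sequences there are, which matters once n and m grow.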
import itertools as it
import timeit
import numpy as np

def try1(n, m):
    # Filter the full product, keeping only the non-decreasing tuples
    return [t for t in it.product(range(n), repeat=m) if all(a <= b for a, b in zip(t[:-1], t[1:]))]

def try2(n, m):
    # Generate the non-decreasing tuples directly
    return list(it.combinations_with_replacement(range(n), m))

def try3(n, m):
    a = np.mgrid[(slice(0, n),) * m]  # All points in an m-dimensional grid within the given ranges
    a = np.rollaxis(a, 0, m + 1)  # Make the 0th axis into the last axis
    a = a.reshape((-1, m))  # Now you can safely reshape while preserving order
    return a[np.all(a[:, :-1] <= a[:, 1:], axis=1)]  # Keep only the non-decreasing rows
>>> %timeit b = try1(3, 4)
10000 loops, best of 3: 78.1 µs per loop
>>> %timeit b = try2(3, 4)
The slowest run took 8.04 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.66 µs per loop
>>> %timeit b = try3(3, 4)
10000 loops, best of 3: 97.8 µs per loop
This holds even for larger numbers:
>>> %timeit b = try1(3, 6)
1000 loops, best of 3: 654 µs per loop
>>> %timeit b = try2(3, 6)
The slowest run took 7.20 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.33 µs per loop
>>> %timeit b = try3(3, 6)
10000 loops, best of 3: 166 µs per loop
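The `%timeit` lines above only work inside IPython. As a rough sketch of the same comparison in a plain script, the standard `timeit` module can be used. The functions are repeated so the snippet is self-contained, the numpy variant is left out to keep it standard-library only, and the variable name `repeats` is arbitrary. Absolute timings depend on the machine, so no figures are claimed here.

```python
import itertools as it
import timeit

def try1(n, m):
    """Filter the full Cartesian product down to non-decreasing tuples."""
    return [t for t in it.product(range(n), repeat=m)
            if all(a <= b for a, b in zip(t, t[1:]))]

def try2(n, m):
    """Generate the non-decreasing tuples directly."""
    return list(it.combinations_with_replacement(range(n), m))

# Both approaches yield the same tuples in the same (lexicographic) order.
assert try1(3, 6) == try2(3, 6)

# timeit.timeit returns the total seconds for `number` calls.
repeats = 1000
t1 = timeit.timeit(lambda: try1(3, 6), number=repeats)
t2 = timeit.timeit(lambda: try2(3, 6), number=repeats)
print(f"try1: {t1 / repeats * 1e6:.1f} us per call")
print(f"try2: {t2 / repeats * 1e6:.1f} us per call")
```

Checking that the outputs agree before timing guards against comparing the speed of two functions that do different things.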
Notes: [...] try1 [...] try3 [...]