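Given a large array, I am looking for the index up to which all elements of the array add up to a number smaller than limit. I found two ways to do this: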
import time as tm
import numpy as nm

# Data that we are working with
large = nm.array([3] * 8000)
limit = 23996

# Numpy version, hoping it would be faster
start = tm.time()  # Start timing
left1 = nm.tril([large] * len(large))  # Build triangular matrix
left2 = nm.sum(left1, 1)  # Sum up all rows of the matrix
idx = nm.where(left2 >= limit)[0][0]  # First row whose sum reaches the limit
stop = tm.time()
print("Numpy result :", idx)
print("Numpy took :", stop - start, " seconds")

# Python loop
sm = 0  # dynamic sum of elements
start = tm.time()
for i in range(len(large)):
    sm += large[i]  # sum up elements one by one
    if sm >= limit:  # check if the sum reaches the limit
        idx = i
        break  # If limit is reached, stop looping.
    else:
        idx = i
stop = tm.time()
print("Loop result :", idx)
print("Loop took :", stop - start, " seconds")
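Unfortunately, the numpy version runs out of memory if the array is much larger. By larger I mean 100 000 values. Of course, this gives a big matrix, but the for loop also takes 2 minutes to run through those 100 000 values. So, where is the bottleneck? How can I speed this code up?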
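For a sense of scale, the size of the triangular matrix can be estimated without building it. This is a rough sketch that assumes 8-byte integer elements (NumPy's default integer on most 64-bit platforms); `tril_matrix_bytes` is a hypothetical helper name, not part of NumPy.

```python
import numpy as np

def tril_matrix_bytes(n, dtype=np.int64):
    """Bytes needed for the dense n x n matrix built by nm.tril([large] * n).

    The int64 default is an assumption about the array's element type.
    """
    return n * n * np.dtype(dtype).itemsize

for n in (8000, 100000):
    print(n, "elements ->", tril_matrix_bytes(n) / 1e9, "GB")
```

Under that assumption the 8000-element case needs about 0.5 GB and the 100 000-element case about 80 GB, which would account for the memory error.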
Answer 0 (score: 3)
You can get this with:
np.argmin(large.cumsum() < limit)
or equivalently
(large.cumsum() < limit).argmin()
In IPython:
In [6]: %timeit (large.cumsum() < limit).argmin()
10000 loops, best of 3: 33.8 µs per loop
For large with 100000 elements and limit = 100000.0/2:
In [4]: %timeit (large.cumsum() < limit).argmin()
1000 loops, best of 3: 444 µs per loop
It makes no practical difference, but it is conventional to write import numpy as np rather than import numpy as nm.
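A small check, as a sketch rather than a benchmark, that the cumulative-sum approach agrees with the explicit loop from the question. One caveat: when the limit is never reached, the mask `large.cumsum() < limit` is all True and `argmin` returns 0, whereas the loop ends on the last index. `first_index_reaching` is a hypothetical name, and falling back to the last index is an assumption chosen here to mirror the loop.

```python
import numpy as np

def first_index_reaching(values, limit):
    """Index of the first element at which the running sum is >= limit.

    Falls back to the last index if the limit is never reached
    (an assumption made here to match the question's loop).
    """
    reached = values.cumsum() >= limit
    if not reached.any():
        return len(values) - 1
    return int(reached.argmax())  # argmax of a boolean array is the first True

large = np.array([3] * 8000)
print(first_index_reaching(large, 23996))  # same case as in the question
print(first_index_reaching(large, 10**9))  # limit never reached
```

With the question's data the two calls print 7998 and 7999.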
Answer 1 (score: 3)
Using numba can speed up the Python loop significantly.
import numba
import numpy as np

def numpyloop(large, limit):
    return np.argmin(large.cumsum() < limit)

# The original answer used @numba.autojit, which current numba releases
# no longer provide; numba.njit is used here instead.
@numba.njit
def pythonloop(large, limit):
    sm = 0
    idx = 0
    for i in range(len(large)):
    #for i in range(large.shape[0]):
        sm += large[i]  # sum up elements one by one
        if sm >= limit:  # check if the sum reaches the limit
            idx = i
            break  # If limit is reached, stop looping.
        else:
            idx = i
    return idx

# Small array, limit reached near the end
large = np.array([3] * 8000)
limit = 23996
%timeit pythonloop(large,limit)
%timeit numpyloop(large,limit)

# Large array, limit reached early
large = np.array([3] * 100000)
limit = 100000/2
%timeit pythonloop(large,limit)
%timeit numpyloop(large,limit)
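Small array (8000 elements, limit = 23996):
Python: 100 loops, best of 3: 6.63 µs per loop
Numpy: 10000 loops, best of 3: 33.2 µs per loop
Large array, small limit (100000 elements, limit = 100000/2):
Python: 100000 loops, best of 3: 12.1 µs per loop
Numpy: 1000 loops, best of 3: 351 µs per loop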
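The gap in the second case is consistent with the loop stopping early while cumsum always walks the whole array. Below is a sketch of getting a similar early exit in plain NumPy by scanning the array in blocks. `chunked_first_index` and the block size of 4096 are hypothetical choices, not taken from either answer, and no timing is claimed for it.

```python
import numpy as np

def chunked_first_index(values, limit, block=4096):
    """First index at which the running sum is >= limit, scanning block by block.

    Returns the last index if the limit is never reached (mirrors the loop).
    """
    total = 0  # running sum of all blocks processed so far
    for start in range(0, len(values), block):
        # Running sums for this block, offset by everything before it
        sums = total + values[start:start + block].cumsum()
        hits = np.flatnonzero(sums >= limit)
        if hits.size:
            return start + int(hits[0])  # stop without touching later blocks
        total = sums[-1]
    return len(values) - 1

large = np.array([3] * 100000)
print(chunked_first_index(large, 100000 / 2))
```

For this input the call prints 16666, the same index the cumsum one-liner gives, but only the first five blocks are summed.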