Question

Given a large array, I am looking for the index up to which all elements of the array add up to less than limit. I found two ways to do this:

import time as tm
import numpy as nm

# Data that we are working with
large = nm.array([3] * 8000)
limit = 23996

# Numpy version, hoping it would be faster
start = tm.time()  # Start timing
left1 = nm.tril([large] * len(large))  # Build triangular matrix
left2 = nm.sum(left1, 1)  # Sum up all rows of the matrix
idx = nm.where(left2 >= limit)[0][0]  # Check what row exceeds the limit
stop = tm.time()
print("Numpy result :", idx)
print("Numpy took :", stop - start, " seconds")

# Python loop
sm = 0  # dynamic sum of elements
start = tm.time()
for i in range(len(large)):
    sm += large[i]  # sum up elements one by one
    if sm >= limit:  # check if the sum exceeds the limit
        idx = i
        break  # If limit is reached, stop looping.
    else:
        idx = i
stop = tm.time()
print("Loop result :", idx)
print("Loop took :", stop - start, " seconds")

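Unfortunately, the numpy version runs out of memory if the array is much larger. By larger I mean 100 000 values. Of course, that gives a huge matrix, but the for loop also takes 2 minutes to run through those 100 000 values. So, where is the bottleneck? How can I speed this code up?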

Answer 1

You can get this with:

np.argmin(large.cumsum() < limit)

or equivalently

(large.cumsum() < limit).argmin()
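
To see why this works, here is a small worked example (the array small and its values are made up for illustration, they are not from the question). cumsum gives the running totals, the comparison gives a boolean mask that is True while the running total is still below the limit, and argmin returns the position of the first False.

```python
import numpy as np

small = np.array([3, 3, 3, 3, 3])   # illustrative data, not from the question
limit = 8

running = small.cumsum()            # running totals: 3, 6, 9, 12, 15
mask = running < limit              # True, True, False, False, False
idx = int(mask.argmin())            # position of the first False, here 2
print(running, mask, idx)
```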

In IPython:

In [6]: %timeit (large.cumsum() < limit).argmin()
10000 loops, best of 3: 33.8 µs per loop

For large containing 100000 elements and limit = 100000.0/2:

In [4]: %timeit (large.cumsum() < limit).argmin()
1000 loops, best of 3: 444 µs per loop
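
One caveat worth keeping in mind: if the running total never reaches limit, the mask is all True and argmin returns 0, which looks the same as the limit being reached at the very first element. The loop in the question returns the last index in that case. Below is a minimal sketch of a guarded variant. It assumes the elements are non-negative, so that the cumulative sum is non-decreasing and can be searched with np.searchsorted; the function name first_index_reaching is hypothetical.

```python
import numpy as np

def first_index_reaching(values, limit):
    """Index of the first element at which the running total is >= limit.

    Falls back to the last index if the limit is never reached, mirroring
    the loop in the question. Assumes non-negative values (so the running
    total is sorted) and a non-empty array.
    """
    running = np.cumsum(values)
    # side="left" returns the first position with running[pos] >= limit
    pos = int(np.searchsorted(running, limit, side="left"))
    return min(pos, len(running) - 1)
```

With the question's data, first_index_reaching(np.array([3] * 8000), 23996) should agree with the argmin expression, while an unreachable limit yields the last index instead of 0.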

It makes no practical difference, but import numpy as np is the conventional spelling rather than import numpy as nm.


Answer 2

Using numba can speed up the Python loop significantly.

import numba
import numpy as np

def numpyloop(large,limit):
    return np.argmin(large.cumsum() < limit)

@numba.njit  # numba.autojit is no longer available in current numba releases; njit compiles on first call
def pythonloop(large,limit):
    sm = 0
    idx = 0
    for i in range(len(large)):
    #for i in range(large.shape[0]):
        sm +=  large[i]  # sum up elements one by one
        if sm >= limit:  # check if the sum exceeds the limit
            idx = i
            break  # If limit is reached, stop looping.
        else:
            idx = i
    return idx

large = np.array([3] * 8000)
limit = 23996  

%timeit pythonloop(large,limit)
%timeit numpyloop(large,limit)

large = np.array([3] * 100000)
limit = 100000/2

%timeit pythonloop(large,limit)
%timeit numpyloop(large,limit)

Python: 100 loops, best of 3: 6.63 µs per loop
Numpy: 10000 loops, best of 3: 33.2 µs per loop

Large array, small limit
Python: 100000 loops, best of 3: 12.1 µs per loop
Numpy: 1000 loops, best of 3: 351 µs per loop
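
The gap in the "large array, small limit" case is consistent with the loop stopping as soon as the limit is reached, whereas cumsum always processes the whole array. If numba is not available, a similar early exit can be approximated in plain NumPy by working through the array in blocks. This is only a sketch: the name chunked_index and the block size of 4096 are illustrative choices, not something from the answers, and it has not been benchmarked here.

```python
import numpy as np

def chunked_index(values, limit, block=4096):
    """Same result as pythonloop, but sums the array block by block.

    Stops at the first block whose running total reaches `limit`, so the
    rest of the array is never touched. Returns the last index if the
    limit is never reached (like the loop). Assumes a non-empty array.
    """
    total = 0
    for start in range(0, len(values), block):
        # Running totals for this block, offset by everything before it
        running = total + np.cumsum(values[start:start + block])
        hit = running >= limit
        if hit.any():
            # argmax on a boolean array gives the position of the first True
            return start + int(hit.argmax())
        total = running[-1]
    return len(values) - 1
```

A smaller block means less wasted work after the limit is reached but more Python-level iterations, so the best value would have to be measured for the data at hand.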
