Question

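I have several nested for loops, and the innermost loop is executed many times. This innermost loop does some heavy computation with numpy, so the whole thing takes a long time. I am therefore trying to optimize the innermost loop.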

The innermost loop contains the following logic:

I have two numpy arrays (much larger in real life):

left = np.asarray([0.4, 0.2, 0.2, 0.7, 0.6, 0.2, 0.3])
right= np.asarray([0.2, 0.7, 0.3, 0.2, 0.1, 0.9, 0.7])

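These are compared against thresholds to decide whether I go left or right. If left[x] > 0.55 and right[x] < 0.45, I want to go left. If right[x] > 0.55 and left[x] < 0.45, I want to go right. I solved this by creating two boolean arrays, one for left and one for right: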

leftListBool = ((left > 0.55)*1 + (right < 0.45)*1 - 1) > 0
rightListBool = ((right > 0.55)*1 + (left < 0.45)*1 - 1) > 0

For the example above this gives me:

leftListBool = [False False False  True  True False False]
rightListBool = [False  True False False False  True  True]
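As a minimal sketch, the same two masks can also be written with the element-wise `&` operator instead of the add-and-subtract arithmetic. The names `left_arith`, `right_arith`, `left_mask` and `right_mask` are introduced here only for illustration; the thresholds and data are the ones from the example above.

```python
import numpy as np

left = np.asarray([0.4, 0.2, 0.2, 0.7, 0.6, 0.2, 0.3])
right = np.asarray([0.2, 0.7, 0.3, 0.2, 0.1, 0.9, 0.7])

# Arithmetic form used in the question.
left_arith = ((left > 0.55) * 1 + (right < 0.45) * 1 - 1) > 0
right_arith = ((right > 0.55) * 1 + (left < 0.45) * 1 - 1) > 0

# Same conditions expressed with the element-wise logical AND.
left_mask = (left > 0.55) & (right < 0.45)
right_mask = (right > 0.55) & (left < 0.45)

# Both formulations agree on this data.
assert np.array_equal(left_arith, left_mask)
assert np.array_equal(right_arith, right_mask)

print(left_mask.tolist())   # [False, False, False, True, True, False, False]
print(right_mask.tolist())  # [False, True, False, False, False, True, True]
```

The `&` form skips the integer conversion and the extra additions, so it allocates fewer temporary arrays.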

However, I cannot go left if I went left the last time (and the same goes for right). So I loop over these lists as follows:

wentLeft = False
wentRight = False
a = 0
for idx, v in enumerate(leftListBool):
    if leftListBool[idx] and not wentRight:
        a += DoAThing(idx)
        wentLeft = False
        wentRight = True
    elif rightListBool[idx] and not wentLeft:
        a += DoAnotherThing(idx)
        wentLeft = True
        wentRight = False

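DoAThing() and DoAnotherThing() just fetch values from a numpy array.

As far as optimization goes, this is where I have got to so far (it was worse before). Note that I need to execute DoAThing() and DoAnotherThing() in the correct order, because they depend on the previous values.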
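To make the alternation rule concrete, here is a small runnable sketch of the loop on the example data. The `values` array, the `walk` function and the `last` variable are hypothetical stand-ins: the array lookup replaces DoAThing() and DoAnotherThing(), which are assumed here to simply read one element per index.

```python
import numpy as np

leftListBool = np.array([False, False, False, True, True, False, False])
rightListBool = np.array([False, True, False, False, False, True, True])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])  # hypothetical data


def walk(left_flags, right_flags, vals):
    """Sum vals at the indices where a move is taken, never repeating a direction."""
    last = None   # direction of the most recent move, None before the first one
    total = 0.0
    taken = []    # (index, direction) of every move that was actually made
    for idx in range(len(left_flags)):
        if left_flags[idx] and last != "left":
            total += vals[idx]
            last = "left"
            taken.append((idx, "left"))
        elif right_flags[idx] and last != "right":
            total += vals[idx]
            last = "right"
            taken.append((idx, "right"))
    return total, taken


total, taken = walk(leftListBool, rightListBool, values)
print(taken)  # [(1, 'right'), (3, 'left'), (5, 'right')]
print(total)  # 12.0
```

Indices 4 and 6 are skipped because they repeat the direction of the previous move.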

What have I tried?

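My first idea was to create a unified list from leftListBool and rightListBool, which would look like this (left = 1 and right = -1):

unified = [0 -1 0 1 1 -1 -1]

But I am stuck on doing this in a more optimized way than the following:

leftListBool.astype(int)-rightListBool.astype(int)

But even once I have that, I need to keep only the first value whenever the same value occurs twice in a row (for example two 1s following each other), which would give:

unified = [0 -1 0 1 0 -1 0]

In that case I could reduce the for loop to: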

for i in unified:
    if i == 1:
        a += DoAThing(a)
    elif i == -1:
        a += DoAnotherThing(a)

But even this for loop could probably be optimized with some numpy magic that I have not thought of yet.
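One possible way to build the unified signal and drop the repeated directions without a Python loop is sketched below. It assumes left = 1 and right = -1 as stated above; the names `idx`, `signals`, `keep` and `deduped` are introduced only for this illustration.

```python
import numpy as np

leftListBool = np.array([False, False, False, True, True, False, False])
rightListBool = np.array([False, True, False, False, False, True, True])

# NumPy does not allow `-` between boolean arrays, so cast to integers first.
# +1 marks a possible left move, -1 a possible right move, 0 neither.
unified = leftListBool.astype(np.int8) - rightListBool.astype(np.int8)

# Look only at the positions that carry a signal.
idx = np.flatnonzero(unified)
signals = unified[idx]

# Keep a signal only if it differs from the previous non-zero signal;
# the first signal is always kept.
keep = np.ones(len(signals), dtype=bool)
keep[1:] = signals[1:] != signals[:-1]

# Write the surviving signals back to their original positions.
deduped = np.zeros_like(unified)
deduped[idx[keep]] = signals[keep]

print(unified.tolist())  # [0, -1, 0, 1, 1, -1, -1]
print(deduped.tolist())  # [0, -1, 0, 1, 0, -1, 0]
```

Comparing each signal with its predecessor only works after the zeros are removed, because two moves in the same direction may be separated by positions where neither condition holds.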

Complete runnable code:

import time
import numpy as np

start = time.time()

topLimit = 0.55
bottomLimit = 0.45

for outI in range(200):
    for midI in range(200):
        topLimit = 0.55
        bottomLimit = 0.45
        res = np.random.rand(200,3)
        left = res[:,0]        
        right = res[:,1]
        valList = res[:,2]

        #These two statements can probably be optimized 
        leftListBool = ((left > topLimit)*1 + (right < bottomLimit)*1 - 1) > 0
        rightListBool = ((right > topLimit)*1 + (left < bottomLimit)*1 - 1) > 0

        wentLeft = False
        wentRight = False
        a=0
        #Hopefully this loop can be optimized
        for idx, v in enumerate(leftListBool):
            if leftListBool[idx] and not wentRight:
                a += valList[idx]
                wentLeft = False
                wentRight = True
            elif rightListBool[idx] and not wentLeft:
                a += valList[idx]
                wentLeft = True
                wentRight = False

end = time.time()
print(end - start)

Answer 1

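If you need to loop over your sequences yourself and you care about performance, you should not use numpy.array. NumPy arrays are great when NumPy can do the looping for you, but they are slow when you have to iterate over them in Python (I explained in detail why iterating over arrays is rather slow in another recent answer, if you want to take a look).

You can simply use tolist and zip to avoid the overhead of iterating over a numpy array: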

import time
import numpy as np

start = time.time()

topLimit = 0.55
bottomLimit = 0.45

for outI in range(200):
    for midI in range(200):
        topLimit = 0.55
        bottomLimit = 0.45
        res = np.random.rand(200,2)
        left = res[:,0].tolist()      # tolist!
        right = res[:,1].tolist()     # tolist!

        wentLeft = False
        wentRight = False
        a=0

        for leftitem, rightitem in zip(left, right):
            if leftitem > topLimit and rightitem < bottomLimit and not wentRight:
                wentLeft, wentRight = False, True
            elif rightitem > topLimit and leftitem < bottomLimit and not wentLeft:
                wentLeft, wentRight = True, False

end = time.time()
print(end - start)

On my computer this reduces the run time by 30%.
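The effect of tolist can be seen in isolation with a small sketch like the following. The helper `count_above` is a hypothetical example function, and the timings depend on the machine and the NumPy version, so no particular numbers are claimed.

```python
import timeit
import numpy as np

arr = np.random.rand(200)
as_list = arr.tolist()

# Indexing or iterating an array yields NumPy scalars,
# while the list holds plain Python floats.
print(type(arr[0]))      # <class 'numpy.float64'>
print(type(as_list[0]))  # <class 'float'>


def count_above(seq, limit=0.55):
    """Count the elements greater than limit using a plain Python loop."""
    n = 0
    for x in seq:
        if x > limit:
            n += 1
    return n


# Same result either way; only the per-element overhead differs.
assert count_above(arr) == count_above(as_list)

t_arr = timeit.timeit(lambda: count_above(arr), number=2000)
t_list = timeit.timeit(lambda: count_above(as_list), number=2000)
print(f"array: {t_arr:.4f}s  list: {t_list:.4f}s")
```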

You could also do the tolist conversion later (this may be faster or slower):

import time
import numpy as np

start = time.time()

topLimit = 0.55
bottomLimit = 0.45

for outI in range(200):
    for midI in range(200):
        topLimit = 0.55
        bottomLimit = 0.45
        res = np.random.rand(200,2)
        left = res[:,0]     
        right = res[:,1]

        # use tolist after the comparisons
        leftListBool = ((left > topLimit) & (right < bottomLimit)).tolist()
        rightListBool = ((right > topLimit) & (left < bottomLimit)).tolist()

        wentLeft = False
        wentRight = False
        a=0
        #Hopefully this loop can be optimized
        for idx in range(len(leftListBool)):  # avoid direct iteration over an array
            if leftListBool[idx] and not wentRight:
                #a += DoAThing(a)
                wentLeft = False
                wentRight = True
            elif rightListBool[idx] and not wentLeft:
                #a += DoAnotherThing(a)
                wentLeft = True
                wentRight = False

end = time.time()
print(end - start)

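This is about as fast as the other approach, but it may be faster when left and right are much larger than 200 elements.

However, this is based only on the algorithm, without knowing what DoAThing and DoAnotherThing do. You may be able to structure them in a way that allows vectorized operations (which could speed this up by an order of magnitude without using lists). But that is more difficult, and I do not know what these functions are doing.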

Answer 2

Based on the updated question, I will present a way to vectorize the code:

import time
import numpy as np

start = time.time()

topLimit = 0.55
bottomLimit = 0.45

for outI in range(200):
    for midI in range(200):
        topLimit = 0.55
        bottomLimit = 0.45
        res = np.random.rand(200,3)
        left = res[:,0]        
        right = res[:,1]
        valList = res[:,2]

        # Arrays containing where to go left and when to go right
        leftListBool = ((left > topLimit) & (right < bottomLimit))
        rightListBool = ((right > topLimit) & (left < bottomLimit))

        # Exclude all points that are neither right or left
        common = leftListBool | rightListBool
        valList = valList[common]
        leftListBool = leftListBool[common]
        rightListBool = rightListBool[common]

        # Remove the values where you would go right or left multiple times in a row
        leftListBool[1:] &= leftListBool[1:] ^ leftListBool[:-1]
        rightListBool[1:] &= rightListBool[1:] ^ rightListBool[:-1]
        valList = valList[leftListBool | rightListBool]

        # Just use np.sum to calculate the sum of the remaining items
        a = np.sum(valList)

end = time.time()
print(end - start)

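The inner loop is completely vectorized, and (on my computer) this approach is 3 times faster than the original code. Let me know if I should add more explanation for any part. The ^ (xor) operator is just a more performant alternative to np.diff that only works for boolean arrays.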
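As a sanity check, the sketch below compares the vectorized steps against a plain loop with the same alternation rule on random data, and also checks the xor and np.diff equivalence for boolean arrays. The function names `loop_sum` and `vectorized_sum` and the constants `TOP` and `BOTTOM` are made up for this illustration.

```python
import numpy as np

TOP, BOTTOM = 0.55, 0.45


def loop_sum(left, right, vals):
    """Reference implementation: plain Python loop over lists."""
    last = None
    total = 0.0
    for l, r, v in zip(left.tolist(), right.tolist(), vals.tolist()):
        if l > TOP and r < BOTTOM and last != "left":
            total += v
            last = "left"
        elif r > TOP and l < BOTTOM and last != "right":
            total += v
            last = "right"
    return total


def vectorized_sum(left, right, vals):
    """Same computation using the masking and xor steps shown above."""
    go_left = (left > TOP) & (right < BOTTOM)
    go_right = (right > TOP) & (left < BOTTOM)
    common = go_left | go_right
    # Boolean indexing returns copies, so the in-place updates below
    # do not touch the caller's arrays.
    vals, go_left, go_right = vals[common], go_left[common], go_right[common]
    # Keep an entry only if it differs from its predecessor.
    go_left[1:] &= go_left[1:] ^ go_left[:-1]
    go_right[1:] &= go_right[1:] ^ go_right[:-1]
    return float(vals[go_left | go_right].sum())


rng = np.random.default_rng(0)
for _ in range(100):
    res = rng.random((200, 3))
    # Summation order differs, so compare with a floating-point tolerance.
    assert np.isclose(loop_sum(res[:, 0], res[:, 1], res[:, 2]),
                      vectorized_sum(res[:, 0], res[:, 1], res[:, 2]))

# For boolean arrays, np.diff marks the positions where neighbours differ,
# which is what xor of the shifted slices computes.
b = rng.random(50) > 0.5
assert np.array_equal(np.diff(b), b[1:] ^ b[:-1])
```

The comparison of neighbours is valid only after the entries that are neither left nor right have been removed, which is why the `common` filter comes first.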

Optimizing an algorithm that is executed many times
