Question

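I need to evaluate many logical conditions on a large 2D "NUMPY" array and collect the overall result in a boolean "RESULT" numpy array.

A simple example, in which all conditions are chained with AND:

RESULT = cond1(NUMPY) & cond2(NUMPY) & cond3(NUMPY) & ....

I would like to understand whether there is a way to optimize performance.

For example, if the first condition (cond1) is False for most of the values in the NUMPY array, evaluating all the other conditions on those values is a waste of resources, because the AND will produce False in the final RESULT array for them anyway.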
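To make the concern concrete, here is a minimal sketch with two made-up conditions (the thresholds, the array shape, and the `calls` bookkeeping list are assumptions for illustration only). It shows that the `&` operator has no short-circuiting: every condition is evaluated on the full array, however few elements passed the earlier ones.

```python
import numpy as np

rng = np.random.default_rng(0)
NUMPY = rng.random((200, 300))

calls = []  # number of elements each condition was evaluated on

def cond1(x):
    # Hypothetical selective condition: False for roughly 90% of uniform values
    calls.append(x.size)
    return x > 0.9

def cond2(x):
    # Hypothetical second, more expensive condition
    calls.append(x.size)
    return np.sin(x) > 0.8

# Both operands are fully computed before `&` combines them
RESULT = cond1(NUMPY) & cond2(NUMPY)
print(calls)
```

Both entries of `calls` equal the full array size, even though cond2 only matters where cond1 was True.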

Any ideas?

Answer 1

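You can do the short-circuiting by hand, though I should add that this is probably only worthwhile in rather extreme cases.

Here is an example with 99 chained logical ANDs. The short-circuiting is done either with the where keyword or with fancy indexing. For this example the second approach gives a decent speedup; the first does not.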

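import numpy as np

# a holds the 1000 values to test; each of the 100 rows of c defines one condition
a = np.random.random((1000,))*1.5
c = np.random.random((100, 1))*1.5

def direct():
    # Baseline: evaluate every condition on every element, then AND along axis 0
    return ((a+c) < np.arccos(np.cos(a+c)*0.99)).all(0)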

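def trickya():
    # Short-circuit with the ufunc `where` keyword: only positions still True in `out`
    # are computed. Skipped positions hold uninitialized values that are never used.
    out = np.ones(a.shape, '?')
    for ci in c:
        np.logical_and(out, np.less(np.add(a, ci, where=out), np.arccos(np.multiply(np.cos(np.add(a, ci, where=out), where=out), 0.99, where=out), where=out), where=out), out=out, where=out)
    return out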

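def trickyb():
    # Short-circuit with fancy indexing: idx holds the positions that passed so far
    idx, = np.where((a+c[0]) < np.arccos(np.cos(a+c[0])*0.99))
    for ci in c[1:]:
        # Evaluate the next condition only on the surviving positions
        idx = idx[(a[idx]+ci) < np.arccos(np.cos(a[idx]+ci)*0.99)]
    # Positions that survived every condition are True, everything else False
    out = np.zeros(a.shape, '?')
    out[idx] = True
    return out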

assert (direct()==trickya()).all()
assert (direct()==trickyb()).all()

from timeit import timeit

print('direct  ', timeit(direct, number=100))
print('where kw', timeit(trickya, number=100))
print('indexing', timeit(trickyb, number=100))

Sample run:

direct   0.49512664100620896
where kw 0.494946873979643
indexing 0.17760096595156938
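The fancy-indexing idea can be adapted to the 2D array and the chain of conditions from the question. The following is only a sketch under assumptions: `short_circuit_and` and the three lambda conditions are hypothetical names invented here, and each condition is assumed to be purely elementwise (it takes a 1D array and returns a boolean array of the same length).

```python
import numpy as np

def short_circuit_and(arr, conds):
    """AND elementwise conditions, testing each one only where all earlier ones held."""
    flat = arr.ravel()                  # a view for contiguous input, otherwise a copy
    idx = np.arange(flat.size)          # flat indices of the elements still passing
    for cond in conds:
        if idx.size == 0:               # nothing left to test
            break
        idx = idx[cond(flat[idx])]      # keep only the survivors
    out = np.zeros(flat.size, dtype=bool)
    out[idx] = True
    return out.reshape(arr.shape)

# Hypothetical conditions; putting the most selective one first helps the most
conds = [lambda x: x > 0.9, lambda x: np.sin(x) > 0.8, lambda x: x**2 < 0.95]

rng = np.random.default_rng(0)
NUMPY = rng.random((200, 300))
RESULT = short_circuit_and(NUMPY, conds)

# The plain chained expression gives the same answer
expected = (NUMPY > 0.9) & (np.sin(NUMPY) > 0.8) & (NUMPY**2 < 0.95)
assert (RESULT == expected).all()
```

Each `flat[idx]` makes a copy, so this is likely to pay off only when the early conditions reject most elements or the later ones are expensive. Timing it against the plain expression on real data is the only reliable check.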

Improve performance of complex logical conditions on numpy arrays
