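What can I do to speed up masked arrays in numpy? I had a very inefficient function that I rewrote to use masked arrays (so that I could simply mask rows instead of copying the array and deleting rows, as I was doing before). However, I was shocked to find that the masked version was 10x slower, because masked arrays are so much slower.
As an example, take the following (the masked version is more than 6 times slower for me):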
import timeit
import numpy as np
import numpy.ma as ma

def test(row):
    return row[0] + row[1]

# Plain ndarray
a = np.arange(1000).reshape(500, 2)
t = timeit.Timer('np.apply_along_axis(test, 1, a)', 'from __main__ import test, a, np')
print(round(t.timeit(100), 6))

# Masked array built from the same data
b = ma.array(a)
t = timeit.Timer('ma.apply_along_axis(test, 1, b)', 'from __main__ import test, b, ma')
print(round(t.timeit(100), 6))
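Answer 0 (score: 4)
I have no idea why the masked array functions are so slow, but since it sounds like you are using the mask to select whole rows (rather than individual values), you can create a regular array from the rows that are not masked and use the np function instead: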
b.mask = np.zeros(b.shape, dtype=bool)  # start with nothing masked
b.mask[498] = True  # mask the whole of row 498
t = timeit.Timer('c = b.view(np.ndarray)[~b.mask[:, 0]]; np.apply_along_axis(test, 1, c)', 'from __main__ import test, b, ma, np')
print(round(t.timeit(100), 6))
Better still, don't use masked arrays at all; just maintain your data and a 1-D mask array as separate variables:
a = np.arange(1000).reshape(500, 2)
mask = np.ones(a.shape[0], dtype=bool)
mask[498] = False
out = np.apply_along_axis(test, 1, a[mask])
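Answer 1 (score: 2)
The most efficient way I know of is to handle the mask manually. Here is a short benchmark for computing a masked mean along an axis. As of 2021 (np.__version__ 1.19.2), the manual implementation is more than 3x faster.
It is worth noting that np.nanmean is as slow as ma.mean. However, I did not find a simple workaround for that, because 0 * nan -> nan and np.where is time consuming. opencv usually has a mask argument for its routines, but in most cases switching libraries may not be appropriate.
benchmark manual (np.sum(..values..)/np.sum(..counts..))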
time for 100x np_mean: 0.15721
benchmark ma.mean
time for 100x ma_mean: 0.580072
benchmark np.nanmean
time for 100x nan_mean: 0.609166
np_mean[:5]: [0.74468436 0.75447124 0.75628326 0.74990387 0.74708414]
ma_mean[:5]: [0.7446843592460088 0.7544712410870448 0.7562832614361736
0.7499038657880674 0.747084143818861]
nan_mean[:5]: [0.74468436 0.75447124 0.75628326 0.74990387 0.74708414]
np_mean == ma_mean: True
np_mean == nan_mean: True
np.__version__: 1.19.2
import timeit
import numpy as np
import numpy.ma as ma
np.random.seed(0)
arr = np.random.rand(1000, 1000)
msk = arr > .5  # positive mask: only elements > .5 are processed
print('\nbenchmark manual (np.sum(..values..)/np.sum(..counts..))')
np_mean = np.sum(arr * msk, axis=0)/np.sum(msk, axis=0)
t = timeit.Timer('np_mean = np.sum(arr * msk, axis=0)/np.sum(msk, axis=0)', globals=globals())
print('\ttime for 100x np_mean:', round(t.timeit(100), 6))
print('\nbenchmark ma.mean')
ma_arr = ma.masked_array(arr, mask=~msk)
ma_mean = ma.mean(ma_arr, axis=0)
t = timeit.Timer('ma_mean = ma.mean(ma_arr, axis=0)', globals=globals())
print('\ttime for 100x ma_mean:', round(t.timeit(100), 6))
print('\nbenchmark np.nanmean')
nan_arr = arr.copy()
nan_arr[~msk] = np.nan
nan_mean = np.nanmean(nan_arr, axis=0)
t = timeit.Timer('nan_mean = np.nanmean(nan_arr, axis=0)', globals=globals())
print('\ttime for 100x nan_mean:', round(t.timeit(100), 6))
print('\n')
print('np_mean[:5]:', np_mean[:5])
print('ma_mean[:5]:', ma_mean[:5])
print('nan_mean[:5]:', nan_mean[:5])
print('np_mean == ma_mean: ', (np_mean == ma_mean).all())
print('np_mean == nan_mean: ', (np_mean == nan_mean).all())
print('np.__version__:', np.__version__)
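The manual version only works if the array contains no NaNs.
If arr contains NaNs: build the mask from np.isnan(arr) (negated, i.e. msk = ~np.isnan(arr), to match the positive-mask convention used above), then replace the NaNs in arr with arr = np.nan_to_num(arr, copy=False, nan=0).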
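A minimal sketch of that NaN handling, assuming the positive-mask convention from the benchmark (True = use the value). The helper name `manual_masked_mean` is hypothetical, and the sketch works on a copy rather than using copy=False so that the input array is left untouched.

```python
import numpy as np


def manual_masked_mean(arr, msk, axis=0):
    """Mean along axis over the elements where msk is True, ignoring NaNs.

    Hypothetical helper for illustration only.
    """
    # NaNs must be excluded from the mask AND replaced by a finite value,
    # because 0 * nan is still nan and would poison the whole sum.
    valid = msk & ~np.isnan(arr)
    values = np.nan_to_num(arr, nan=0.0)  # returns a copy with NaN -> 0
    return np.sum(values * valid, axis=axis) / np.sum(valid, axis=axis)


rng = np.random.default_rng(0)
arr = rng.random((1000, 50))
arr[::7, 3] = np.nan  # put some NaNs into one column
msk = arr > 0.5       # positive mask, as in the benchmark

result = manual_masked_mean(arr, msk)

# Cross-check against np.nanmean, with excluded entries set to NaN.
reference = np.nanmean(np.where(msk, arr, np.nan), axis=0)
print(np.allclose(result, reference))
```

Note that a column with no valid entries at all would produce a division by zero here; the sketch does not guard against that case.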
Answer 2 (score: 0)
In [16]: %timeit np.apply_along_axis(test, 1, a)
100 loops, best of 3: 15.3 ms per loop
In [17]: %timeit np.apply_along_axis(test, 1, b)
100 loops, best of 3: 15.3 ms per loop
In [12]: %timeit np.ma.apply_along_axis(test, 1, b)
10 loops, best of 3: 80.8 ms per loop
In [18]: np.__version__
Out[18]: '1.5.1'