ufunc memory consumption in arithmetic expressions

Time: 2018-05-25 11:54:17

Tags: python numpy copy numpy-ufunc

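What is the memory consumption of arithmetic numpy expressions, for example an expression such as

vec ** 3 + vec ** 2 + vec

(where vec is a numpy.ndarray)? Is an array stored for each intermediate operation? Could such compound expressions use several times more memory than the underlying ndarray?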

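1 answer:

Answer 0 (score: 2):

You are right: a new array is allocated for each intermediate result. Fortunately, the numexpr package is designed to solve this problem. From its description: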

  

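The main reason why NumExpr achieves better performance than NumPy is that it avoids allocating memory for intermediate results. This results in better cache utilization and reduces memory access in general. Due to this, NumExpr works best with large arrays.

Example: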
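The intermediate allocations can be observed directly. The following is a minimal sketch, assuming a NumPy version that reports its array buffers to `tracemalloc` (recent releases do); the helper `peak_bytes` and the name `vec` are hypothetical and chosen only for illustration. The exact ratio depends on the NumPy version and platform, because NumPy can sometimes reuse a temporary in place.

```python
import tracemalloc

import numpy as np


def peak_bytes(func):
    """Return the peak number of bytes allocated while func() runs."""
    tracemalloc.start()
    try:
        result = func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    del result
    return peak


vec = np.random.rand(1_000_000)  # float64, allocated before tracing starts

# vec ** 3 and vec ** 2 must both exist before they can be added,
# so at least two full-size arrays are alive at the same time.
peak = peak_bytes(lambda: vec ** 3 + vec ** 2 + vec)
ratio = peak / vec.nbytes
print(f"peak allocation: {ratio:.1f}x the size of vec")
```

A ratio of about 2 or more means the expression needed at least twice the memory of `vec` on top of `vec` itself.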

In [97]: xs = np.random.rand(1_000_000)

In [98]: %timeit xs ** 3 + xs ** 2 + xs
26.8 ms ± 371 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [99]: %timeit numexpr.evaluate('xs ** 3 + xs ** 2 + xs')
1.43 ms ± 20.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
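If numexpr is not available, the number of temporaries can also be reduced by hand with the `out=` argument that NumPy ufuncs accept. This is only a sketch of that idea, not what numexpr does internally; the buffer names `result` and `scratch` are assumptions made for the example, and the powers are written as repeated multiplication.

```python
import numpy as np

xs = np.random.rand(1_000_000)

# Reference: allocates a fresh array for every intermediate result.
expected = xs ** 3 + xs ** 2 + xs

# Hand-written variant with one output buffer and one scratch buffer.
result = np.empty_like(xs)
scratch = np.empty_like(xs)

np.multiply(xs, xs, out=scratch)      # scratch = xs ** 2
np.multiply(scratch, xs, out=result)  # result  = xs ** 3
np.add(result, scratch, out=result)   # result  = xs ** 3 + xs ** 2
np.add(result, xs, out=result)        # result  = xs ** 3 + xs ** 2 + xs

print(np.allclose(result, expected))
```

The trade-off is readability: the expression is spread over several statements, and the caller has to manage the buffers.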

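Thanks to @max9111 for pointing out that numexpr simplifies powers to multiplications. It seems that most of the difference in the benchmark is explained by the optimization of xs ** 3.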

In [421]: %timeit xs * xs
1.62 ms ± 12 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [422]: %timeit xs ** 2
1.63 ms ± 10.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [423]: %timeit xs ** 3
22.8 ms ± 283 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [424]: %timeit xs * xs * xs
2.52 ms ± 58.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
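Given the timings above, the same polynomial can be written in plain NumPy without any power operation. The sketch below uses Horner's form, xs * (xs * (xs + 1) + 1), with in-place operators; `poly_horner` is a hypothetical helper name, and no timing is claimed for it here.

```python
import numpy as np


def poly_horner(xs):
    """Evaluate xs**3 + xs**2 + xs as xs * (xs * (xs + 1) + 1)."""
    acc = xs + 1.0   # the only new array allocated in this function
    acc *= xs        # xs**2 + xs
    acc += 1.0       # xs**2 + xs + 1
    acc *= xs        # xs**3 + xs**2 + xs
    return acc


xs = np.random.rand(1_000_000)
original = xs.copy()

fast = poly_horner(xs)
print(np.allclose(fast, xs ** 3 + xs ** 2 + xs))
```

Only `acc` is allocated, and the input array is left unmodified because every in-place operation targets `acc`.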