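I noticed that numpy.sin behaves differently depending on whether the argument has <= 8192 or > 8192 elements. The difference shows up in both performance and the returned values. Can someone explain this effect?
For example, let's compute sin(pi / 4):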
import numpy as np

x = np.pi * 0.25
for n in range(8191, 8195):
    xx = np.repeat(x, n)  # array of n copies of pi/4
    %timeit np.sin(xx)    # IPython magic; run in IPython or Jupyter
    print(n, np.sin(xx)[0])
64.7 µs ± 194 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
8191 0.7071067811865476
64.6 µs ± 166 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
8192 0.7071067811865476
20.1 µs ± 189 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
8193 0.7071067811865475
21.8 µs ± 13.4 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
8194 0.7071067811865475
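Once the array exceeds 8192 elements, the computation becomes more than 3 times faster and returns a different result: the last digit becomes 5 instead of 6.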
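When I computed the same value by other means, I got the following: std::sin (Visual Studio 2017, Win32 platform) gives 0.7071067811865475; std::sin (Visual Studio 2017, x64 platform) gives 0.70710678118654756; math.sin gives 0.7071067811865476, which makes sense because I am using 64-bit Python. I could not find any explanation in the NumPy documentation or in its source code.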
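To put the discrepancy in perspective, the two printed results can be compared at the bit level. The following is a minimal sketch (ulp_distance is a hypothetical helper written for illustration, not a NumPy function) that counts how many representable doubles separate two values of the same sign:

```python
import numpy as np

def ulp_distance(a, b):
    """Number of representable float64 steps between a and b.

    Hypothetical helper for illustration; valid for finite values
    of the same sign.
    """
    # Reinterpret the float64 bit patterns as integers; for same-sign
    # values, adjacent doubles have adjacent integer patterns.
    bits = np.array([a, b], dtype=np.float64).view(np.int64)
    return abs(int(bits[1]) - int(bits[0]))

# The two values printed for sin(pi / 4) in the question
small_n = 0.7071067811865476   # n <= 8192
large_n = 0.7071067811865475   # n > 8192

print(ulp_distance(small_n, large_n))
print(np.nextafter(large_n, 1.0) == small_n)
```

A distance of 1 means the two results are adjacent doubles, so they differ only in the last bit of the mantissa. That would be consistent with two sin implementations that round differently, rather than with a gross numerical error.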
Update #2: Hard to believe, but replacing sin with sqrt shows the same effect:
44.2 µs ± 751 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
8191 0.8862269254527579
44.1 µs ± 543 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
8192 0.8862269254527579
10.3 µs ± 105 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
8193 0.886226925452758
10.4 µs ± 4.41 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
8194 0.886226925452758
Update: output of np.show_config():
mkl_info:
libraries = ['mkl_rt']
library_dirs = ['C:/GNU/Anaconda3\\Library\\lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl\\include', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl\\lib', 'C:/GNU/Anaconda3\\Library\\include']
blas_mkl_info:
libraries = ['mkl_rt']
library_dirs = ['C:/GNU/Anaconda3\\Library\\lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl\\include', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl\\lib', 'C:/GNU/Anaconda3\\Library\\include']
blas_opt_info:
libraries = ['mkl_rt']
library_dirs = ['C:/GNU/Anaconda3\\Library\\lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl\\include', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl\\lib', 'C:/GNU/Anaconda3\\Library\\include']
lapack_mkl_info:
libraries = ['mkl_rt']
library_dirs = ['C:/GNU/Anaconda3\\Library\\lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl\\include', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl\\lib', 'C:/GNU/Anaconda3\\Library\\include']
lapack_opt_info:
libraries = ['mkl_rt']
library_dirs = ['C:/GNU/Anaconda3\\Library\\lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl\\include', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019.0.117\\windows\\mkl\\lib', 'C:/GNU/Anaconda3\\Library\\include']
Answer 0 (score: 2)
As @WarrenWeckesser said, "it is almost certainly an Anaconda and Intel MKL issue; see https://github.com/numpy/numpy/issues/11448 and https://github.com/ContinuumIO/anaconda-issues/issues/9129".
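Unfortunately, the only way to solve the problem on Windows is to uninstall Anaconda and use a NumPy distribution without MKL. I used python-3.6.6-amd64 from https://www.python.org/ and installed everything else via pip, including numpy 1.14.5. I even managed to get Spyder working (I had to downgrade PyQt5 to 5.11.3, because it refused to start on >= 5.12).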
Now np.sin(xx) is always 0.7071067811865476 (67.1 µs at n = 8192) and np.sqrt(xx) is always 0.8862269254527579 (16.4 µs). A bit slower, but perfectly reproducible.
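The reproducibility check can be scripted so that it is easy to rerun after changing the NumPy installation. The following is a minimal sketch; first_values and is_size_independent are hypothetical helper names, and the array sizes are the ones used in the question.

```python
import numpy as np

def first_values(func, x, sizes):
    """Apply func to arrays of n copies of x and return {n: first element}."""
    return {n: float(func(np.repeat(x, n))[0]) for n in sizes}

def is_size_independent(results):
    """True if every array size produced exactly the same value."""
    return len(set(results.values())) == 1

x = np.pi * 0.25
sizes = range(8191, 8195)
for func in (np.sin, np.sqrt):
    results = first_values(func, x, sizes)
    print(func.__name__, results,
          "size-independent:", is_size_independent(results))
```

On a build affected by the issue, the flag would be expected to come out False, as in the question's output; with the pip-installed NumPy described above it should be True.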