Question

I would like to know why MATLAB is so much faster than Python and R at Cholesky factorization, even though at least Python is also using Intel's MKL.

             MATLAB         Python          R
------------------------------------------------------------
Mean time    8.0034e-4 s    30.7093e-4 s    105.0849e-4 s
Relative     100.00 %       383.70 %        1313.00 %
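
The percentages in the table are each mean divided by the MATLAB mean. A minimal sketch of that arithmetic, with the three means copied from the table; the names `mean_seconds` and `relative_percent` are illustrative only and do not appear in the benchmark scripts:

```python
# Mean time per chol call in seconds, copied from the table above.
mean_seconds = {"MATLAB": 8.0034e-4, "Python": 30.7093e-4, "R": 105.0849e-4}


def relative_percent(means, baseline="MATLAB"):
    """Return each mean as a percentage of the baseline's mean."""
    base = means[baseline]
    return {name: 100.0 * value / base for name, value in means.items()}


for name, pct in relative_percent(mean_seconds).items():
    print(f"{name}: {pct:.2f} %")
```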

As an example, I use a matrix generated with MATLAB:

matrix = gallery('moler',600)

(600 because the matrices in my actual code have this size)

and stored it as a .csv file.
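
For readers without MATLAB, the test matrix can probably be rebuilt in NumPy. This is a sketch, assuming the documented default of gallery('moler', n) (alpha = -1), where the matrix is U'*U for a unit upper triangular U with alpha everywhere above the diagonal. The function name `moler` is my own choice and is not part of the benchmark:

```python
import numpy as np


def moler(n, alpha=-1.0):
    """Build the n-by-n Moler matrix as U.T @ U.

    Assumed to match MATLAB's gallery('moler', n, alpha); not verified
    against the CSV file used in the question.
    """
    # U is unit upper triangular with alpha everywhere above the diagonal.
    u = np.eye(n) + alpha * np.triu(np.ones((n, n)), k=1)
    return u.T @ u


# Small instance for inspection. With alpha = -1 and 0-based indices,
# a[i, i] == i + 1 and a[i, j] == min(i, j) - 1 off the diagonal.
a = moler(4)
print(a)
```

Under the same assumption, np.savetxt('./matrix.csv', moler(600), delimiter=',') would write a file in the layout that the scripts below read.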

Code:

MATLAB:

filename = './matrix.csv';
delimiter = ',';

formatSpec = [repmat('%f', 1,600) '%[^\n\r]'];

fileID = fopen(filename,'r');

dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'EmptyValue' ,NaN, 'ReturnOnError', false);

fclose(fileID);

x = [dataArray{1:end-1}];

clearvars filename delimiter formatSpec fileID dataArray;

t=zeros(1000, 1);
for i=1:length(t)
    tic;
    chol(x,'lower');
    t(i)=toc;
end;

disp(['Mean: ' num2str(mean(t)) [' s']])
>> test
Mean: 0.00080034 s

Python:

import time
import numpy as np

x = np.loadtxt('./matrix.csv', delimiter=',')
time_elapsed = np.zeros(1000)
for i in range(0, len(time_elapsed)):
    t = time.time()
    np.linalg.cholesky(x)
    time_elapsed[i] = time.time() - t
print('Mean: '+str(time_elapsed.mean())+" s")

python ./test.py
Mean: 0.00307093310356 s
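
The loop above averages every call, including the first one, which could carry one-off start-up cost (an assumption, not something measured here). A possible stricter harness, sketched with time.perf_counter, a few untimed warm-up calls, and median and minimum reported alongside the mean. The helper name `benchmark` and its parameters are hypothetical:

```python
import statistics
import time


def benchmark(func, repeats=1000, warmup=10):
    """Call func() `warmup` times untimed, then time `repeats` calls.

    Returns mean, median and minimum wall-clock time in seconds.
    """
    for _ in range(warmup):
        func()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "min": min(samples),
    }


# Stand-in workload so the sketch runs without the CSV file.
stats = benchmark(lambda: sum(i * i for i in range(1000)), repeats=100)
print(stats)
```

With the matrix loaded as x as in the script above, the call would be benchmark(lambda: np.linalg.cholesky(x)). If the median and minimum sit well below the mean, a few slow outlier calls are inflating the average.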

R:

library(readr)
library(tictoc)
x <- read_csv("./matrix.csv", col_names = FALSE)
tic.clearlog()
for (i in 0:1000)
{
  tic(i)
  chol(x)
  toc(log = TRUE, quiet = TRUE)
}
log.lst <- tic.log(format = FALSE)
tic.clearlog()
timings <- unlist(lapply(log.lst, function(x) x$toc - x$tic))
print(paste0("Mean: ", mean(timings), " s"))

source('./test.R')
[1] "Mean: 0.0105084915084904 s"

System information:

Intel(R) Xeon(R) CPU E5-2643 v3 @ 3.40GHz, 6 cores, HT enabled, 64 GB RAM

OS:

uname -a
Linux 4.8.0-2-amd64 #1 SMP Debian 4.8.15-2 (2017-01-04) x86_64 GNU/Linux

MATLAB:

version
9.1.0.441655 (R2016b)

version -java
Java 1.7.0_60-b19 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode

version -blas
Intel(R) Math Kernel Library Version 11.3.1 Product Build 20151021 for Intel(R) 64 architecture applications, CNR branch AVX2

version -lapack
Intel(R) Math Kernel Library Version 11.3.1 Product Build 20151021 for Intel(R) 64 architecture applications, CNR branch AVX2
Linear Algebra PACKage Version 3.5.0

R:

version
               _                           
platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          3                           
minor          3.2                         
year           2016                        
month          10                          
day            31                          
svn rev        71607                       
language       R                           
version.string R version 3.3.2 (2016-10-31)
nickname       Sincere Pumpkin Patch

Python:

conda --version
conda 4.3.11

python --version
Python 3.5.2 :: Anaconda custom (64-bit)
np.version.full_version
'1.11.2'

np.__config__.show()
blas_mkl_info:
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    library_dirs = ['~/anaconda3/lib']
    include_dirs = ['~/anaconda3/include']
    libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'iomp5', 'pthread']
lapack_mkl_info:
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    library_dirs = ['~/anaconda3/lib']
    include_dirs = ['~/anaconda3/include']
    libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'iomp5', 'pthread']
lapack_opt_info:
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    library_dirs = ['~/anaconda3/lib']
    include_dirs = ['~/anaconda3/include']
    libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'iomp5', 'pthread']
blas_opt_info:
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    library_dirs = ['~/anaconda3/lib']
    include_dirs = ['~/anaconda3/include']
    libraries = ['mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'iomp5', 'pthread']
openblas_lapack_info:
  NOT AVAILABLE

mkl.get_version_string()
'Intel(R) Math Kernel Library Version 11.3.3 Product Build 20160413 for Intel(R) 64 architecture applications'

Cholesky decomposition: Why is MATLAB faster than Python / R
