For an nxN matrix with N >> n, I have noticed that MATLAB's sum() is not very efficient. For example, consider:

N = 10000000;   % number of columns
T = 30;         % number of repetitions per timing
c=rand(2,N);
tic;for ii=1:T;d=sum(c);end;toc
tic;for ii=1:T;d=c(1,:)+c(2,:);end;toc
> Elapsed time is 1.250268 seconds.
> Elapsed time is 0.567871 seconds.
c=rand(3,N);
tic;for ii=1:T;d=sum(c);end;toc
tic;for ii=1:T;d=c(1,:)+c(2,:)+c(3,:);end;toc
> Elapsed time is 1.514810 seconds.
> Elapsed time is 0.821631 seconds.
c=rand(4,N);
tic;for ii=1:T;d=sum(c);end;toc
tic;for ii=1:T;d=c(1,:)+c(2,:)+c(3,:)+c(4,:);end;toc
> Elapsed time is 1.519009 seconds.
> Elapsed time is 1.069865 seconds.

In all cases the explicit summation takes less time, although the trend suggests that sum will eventually win as n increases further.
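Since a tic/toc loop can be sensitive to warm-up and background load, the same comparison could also be repeated with timeit, which runs the function several times and reports a typical time. This is only a sketch under my own assumptions: N is reduced to 1e6 to keep the run short, only the n = 3 case is shown, and the variable names tSum, tExplicit and maxDiff are made up for illustration. I have not measured anything with it.

```matlab
% Sketch: re-time the n = 3 case with timeit instead of a tic/toc loop.
% N is smaller than in the measurements above (assumption, to keep it quick).
N = 1e6;
c = rand(3, N);

% Time the built-in column-wise sum (dimension 1 given explicitly).
tSum = timeit(@() sum(c, 1));

% Time the explicit row-by-row addition.
tExplicit = timeit(@() c(1,:) + c(2,:) + c(3,:));

fprintf('sum(c,1): %.4f s   explicit: %.4f s\n', tSum, tExplicit);

% Sanity check: both approaches should agree up to rounding.
d1 = sum(c, 1);
d2 = c(1,:) + c(2,:) + c(3,:);
maxDiff = max(abs(d1 - d2));
```

If maxDiff is at rounding level, both variants compute the same result and only their timings differ.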
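Why is sum not efficient?

Also, sum does not seem to benefit from more computational threads. For example:

c=rand(10,N);
maxNumCompThreads(2);   % allow two computational threads
tic;for ii=1:T;d=sum(c);end;toc
maxNumCompThreads(1);   % restrict to a single thread
tic;for ii=1:T;d=sum(c);end;toc
> Elapsed time is 2.496837 seconds.
> Elapsed time is 2.450345 seconds.

I suppose this makes some sense, since parallelization may only kick in when n is large.

If n stays small, is there a way to make this computation benefit from multithreading? Or is there a better strategy?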
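One candidate strategy that might be worth timing is to write the column sum as a product with a row vector of ones, so that the work goes through MATLAB's matrix multiplication rather than through sum. This is an assumption on my part, not something measured above: whether it is actually faster, or uses more than one thread for such a thin matrix, would need to be checked on the machine in question. The names w, tMul and relErr are hypothetical, and N is reduced to 1e6 to keep the run short.

```matlab
% Sketch: column sums via a ones-vector product, compared against sum.
% Assumption: n = 10 as in the thread test above, N reduced to 1e6.
n = 10;
N = 1e6;
c = rand(n, N);
w = ones(1, n);          % 1-by-n row vector of ones

% w*c is 1-by-N and equals the sum over the rows of c.
tSum = timeit(@() sum(c, 1));
tMul = timeit(@() w * c);

fprintf('sum(c,1): %.4f s   ones(1,n)*c: %.4f s\n', tSum, tMul);

% Sanity check: the two results should agree up to rounding.
dSum = sum(c, 1);
dMul = w * c;
relErr = max(abs(dSum - dMul)) / max(abs(dSum));
```

The same two timings could be repeated after calling maxNumCompThreads(1) and maxNumCompThreads(2) to see whether either variant responds to the thread count.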
Many thanks!