Question

重新发布其他细节，这些细节会大大改变我的第一个问题的范围。这是原始代码：

K = zeros(N*N)
for a=1:N
    for i=1:I
        for j=1:J
            M = kron(X(:,:,a).',Y(:,:,a,i,j));

            pr = real(trace(E*M));
            K = K+H(i,j,a)*M/pr;
        end
    end
end

其中E是布尔掩码，H是包含N IxJ直方图的3D矩阵。 K是输出

目标是对kroniker乘法调用进行矢量化。我的直觉是将X和Y视为矩阵的容器（作为参考，X和Y的切片被馈送到kron是7x7阶的矩阵）。在此容器方案下，X显示为1-D容器，Y显示为3-D容器。我的下一个猜测是将Y重塑为2-D容器或更好的1-D容器，然后进行X和Y的元素乘法。问题是：如何以保留M和M的轨迹的方式重塑matlab甚至可以在这个容器构思中处理这个想法，还是需要进一步重塑容器以进一步暴露内部矩阵元素？

Answer 1

使用7D置换

的矩阵乘法

% Get sizes
[m1,m2,~] =  size(X);
[n1,n2,N,n4,n5] =  size(Y);

% Perform kron format elementwise multiplication betwen the first two dims
% of X and Y, keeping the third dim aligned and "pushing out" leftover dims
% from Y to the back
mults = bsxfun(@times,permute(X,[4,2,5,1,3]),permute(Y,[1,6,2,7,3,4,5]));
mults3D = reshape(mults,m1*n1,m2*n2,[]);
Emults3D = reshape(E*reshape(mults3D,size(mults3D,1),[]),size(mults3D));

% Trace summations by using linear indices of diagonal on 3D slices in Emults3D
MN = m1*n1;
idx = 1:MN+1:MN^2;
idx2D = bsxfun(@plus,idx(:),MN^2*(0:size(Emults3D,3)-1));
pr_sums = sum(Emults3D(idx2D),1);

% Perform "M/pr" equivalent elementwise divisions and then use
% matrix-multiplication to reduce the iterative summations
Mp = bsxfun(@rdivide,mults3D,reshape(pr_sums,1,1,[]));
out = reshape(Mp,[],size(Mp,3))*reshape(permute(H,[3,1,2]),[],1);
out = reshape(out,m1*n1,m2*n2);

<强>基准

输入设置如此 -

% Size parameter
n = 5;

% Setup inputs
X = rand(n,n,n);
Y = rand(n,n,n,n,n);
E = rand(n*n,n*n)>0.5;
H = rand(n,n,n);
num_iter = 500; % Number of iterations to run the approaches for

运行时结果是 -

----------------------------- With Loop
Elapsed time is 8.806286 seconds.
----------------------------- With Vectorization
Elapsed time is 1.471877 seconds.

将大小参数n设置为10，运行时间为 -

----------------------------- With Loop
Elapsed time is 5.068872 seconds.
----------------------------- With Vectorization
Elapsed time is 4.399783 seconds.

使用跟踪计算矢量化kroniker multiplcation

1 个答案: