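Below is my code for neural network forward propagation. I want to speed it up. The loops take most of the time, so can anybody help me rewrite the code to make it faster, for example by vectorizing it as the MATLAB documentation recommends? In this code I take a 4x4 receptive field from the 19x19 input each time, then multiply each pixel in it by the corresponding 4x4 block of weights (net.w{layer_no}(u,v), which is 19x19 in total). You could also call it the dot product of the two. I did not compute the dot product of the two small matrices directly because there is a boundary check. The result is a 6x6 output. I am not an experienced coder, so I did the best I could. Can anyone guide me on how to speed this up? It takes a lot of time compared with OpenCV. I would be grateful. Regards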
receptiveSize = 4;
overlap= 1;
inhibatory = 0;
gap = receptiveSize-overlap;
UpperLayerSize = size(net.b{layer_no}); % 6x6
Curr_layerSize = size(net.w{layer_no}); % 19x19
for u=1:UpperLayerSize(1)-1
    for v=1:UpperLayerSize(2)-1
        summed_value=0;
        min_u = (u - 1) * gap + 1;
        max_u = (u - 1) * gap + receptiveSize;
        min_v = (v - 1) * gap + 1;
        max_v = (v - 1) * gap + receptiveSize;
        for i = min_u : max_u
            for j = min_v : max_v
                if(i>Curr_layerSize(1) || j>Curr_layerSize(2))
                    continue;
                end
                if(i<1 || j<1)
                    continue;
                end
                summed_value = summed_value + input{layer_no}.images(i,j,sample_ind) * net.w{layer_no}(i,j);
            end
        end
        summed_value = summed_value + net.b{layer_no}(u,v);
        input{layer_no+1}.images(u,v,sample_ind) = summed_value;
    end
end
temp = activate_Mat(input{layer_no+1}.images(:,:,sample_ind),net.AF{layer_no});
output{layer_no}.images(:,:,sample_ind) = temp(:,:);
Answer 0 (score: 1)
How about changing the inner loops (the loop over i and the loop over j) to:
ii = max( 1, min_u ) : min( max_u, Curr_layerSize(1) );
jj = max( 1, min_v ) : min( max_v, Curr_layerSize(2) );
input{layer_no+1}.images(u,v,sample_ind) = ...
reshape( input{layer_no}.images(ii,jj,sample_ind), 1, [] ) * ...
reshape( net.w{layer_no}(ii,jj), [], 1 ) + ...
net.b{layer_no}(u,v); %// should this term be added rather than multiplied?
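Going one step further, the two outer loops over u and v can probably be removed as well. The following is only a minimal sketch under stated assumptions, not the original poster's code. X, W and B are hypothetical stand-ins for input{layer_no}.images(:,:,sample_ind), net.w{layer_no} and net.b{layer_no}; the bias is added, as in the question; and all 6x6 outputs are computed, whereas the loops in the question stop at UpperLayerSize-1. The idea is to multiply the image and the weights element-wise once, sum every 4x4 window with conv2 and a kernel of ones, and then keep every gap-th window.

```matlab
% Stand-in data (assumption): replace with the real layer data.
receptiveSize = 4;
overlap = 1;
gap = receptiveSize - overlap;      % stride between neighbouring windows (3)
X = rand(19, 19);                   % one input sample
W = rand(19, 19);                   % per-pixel weights
B = rand(6, 6);                     % one bias per output unit
outSize = size(B);

% Element-wise products, computed once for the whole image.
P = X .* W;

% Zero-pad so that any window running past the border contributes nothing,
% which mirrors the bounds checks in the loop version.
needed = (outSize - 1) * gap + receptiveSize;
padded = zeros(max(needed, size(P)));
padded(1:size(P, 1), 1:size(P, 2)) = P;

% Sum over every receptiveSize-by-receptiveSize window (stride 1) ...
windowSums = conv2(padded, ones(receptiveSize), 'valid');

% ... then keep only the windows that start every gap pixels, and add the bias.
rows = 1 : gap : (outSize(1) - 1) * gap + 1;
cols = 1 : gap : (outSize(2) - 1) * gap + 1;
Y = windowSums(rows, cols) + B;
```

Y would then take the place of input{layer_no+1}.images(:,:,sample_ind) before the call to activate_Mat. If the last row and column of outputs really should be skipped, as the loop bounds in the question do, drop the last element of rows and cols and index B accordingly.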