I want to train my network using MATLAB and matconvnet-1.0-beta25. My problem is regression, and I use pdist as the loss layer to obtain the MSE. The input data is 56*56*64*6000 and the target data is 56*56*64*6000. The network architecture is as follows:
opts.networkType = 'simplenn' ;
opts = vl_argparse(opts, varargin) ;
lr = [.01 2] ;
% Define network CIFAR10-quick
net.layers = {} ;
% Block 1
net.layers{end+1} = struct('type', 'conv', ...
'weights', {{0.01*randn(5,5,64,32, 'single'), zeros(1, 32, 'single')}}, ...
'learningRate', lr, ...
'stride', 1, ...
'pad', 2) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'conv', ...
'weights', {{0.05*randn(5,5,32,16, 'single'), zeros(1,16,'single')}}, ...
'learningRate', .1*lr, ...
'stride', 1, ...
'pad', 2) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'conv', ...
'weights', {{0.01*randn(5,5,16,8, 'single'), zeros(1, 8, 'single')}}, ...
'learningRate', lr, ...
'stride', 1, ...
'pad', 2) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'conv', ...
'weights', {{0.05*randn(5,5,8,16, 'single'), zeros(1,16,'single')}}, ...
'learningRate', .1*lr, ...
'stride', 1, ...
'pad', 2) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'conv', ...
'weights', {{0.01*randn(5,5,16,32, 'single'), zeros(1, 32, 'single')}}, ...
'learningRate', lr, ...
'stride', 1, ...
'pad', 2) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'conv', ...
'weights', {{0.05*randn(5,5,32,64, 'single'), zeros(1,64,'single')}}, ...
'learningRate', .1*lr, ...
'stride', 1, ...
'pad', 2) ;
net.layers{end+1} = struct('type', 'relu') ;
% Loss layer
net.layers{end+1} = struct('type', 'pdist') ;
% Meta parameters
net.meta.inputSize = [56 56 64] ;
net.meta.trainOpts.learningRate = [0.0005*ones(1,30) 0.0005*ones(1,10) 0.0005*ones(1,5)] ;
net.meta.trainOpts.weightDecay = 0.0001 ;
net.meta.trainOpts.batchSize = 100 ;
net.meta.trainOpts.numEpochs = numel(net.meta.trainOpts.learningRate) ;
% Fill in default values
net = vl_simplenn_tidy(net) ;
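With 5*5 kernels, stride 1 and pad 2, every conv layer should keep the 56*56 spatial size, and the last conv brings the depth back to 64, so the tensor entering the pdist layer should be 56*56*64 per sample. A quick way to check the per-layer sizes (assuming vl_simplenn_display in beta25 accepts an 'inputSize' option):
% print the output size of every layer for a single 56x56x64 input
vl_simplenn_display(net, 'inputSize', [56 56 64 1]) ;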
I changed the getSimpleNNBatch(imdb, batch) function in ncnn_train (my own renamed copy) as follows:
function [images, labels] = getSimpleNNBatch(imdb, batch)
images = imdb.images.data(:,:,:,batch) ;
labels = imdb.images.labels(:,:,:,batch) ;
if rand > 0.5, images = fliplr(images) ; end
because my labels are multidimensional.
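Since the labels are spatial maps of the same size as the inputs, the horizontal flip presumably has to be applied to the labels as well; a minimal sketch of that variant (an assumption on my side, not the code above):
function [images, labels] = getSimpleNNBatch(imdb, batch)
images = imdb.images.data(:,:,:,batch) ;
labels = imdb.images.labels(:,:,:,batch) ;
if rand > 0.5
  % flip inputs and per-pixel targets together so they stay aligned
  images = fliplr(images) ;
  labels = fliplr(labels) ;
end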
I also changed errorFunction in cnn_train from 'multiclass' to 'none':
opts.errorFunction = 'none' ;
and changed the error variable from:
% accumulate errors
error = sum([error, [...
sum(double(gather(res(end).x))) ;
reshape(params.errorFunction(params, labels, res),[],1) ; ]],2) ;
to:
% accumulate errors
error = sum([error, [...
mean(mean(mean(double(gather(res(end).x))))) ;
reshape(params.errorFunction(params, labels, res),[],1) ; ]],2) ;
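For reference, the three nested mean calls collapse the output of the pdist layer to a single scalar, which is the same as averaging over all of its elements; a minimal equivalent (assuming res(end).x has size 56*56*1*100):
z = double(gather(res(end).x)) ;  % per-location distances, 56 x 56 x 1 x 100
objective = mean(z(:)) ;          % identical to mean(mean(mean(z)))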
My first question is why the third dimension of res(end).x in the code above is 1 instead of 64. Its size is 56*56*1*100 (100 is the batch size). Have I made a mistake?
The results are as follows:
train: epoch 01: 2/ 40: 10.1 (27.0) Hz objective: 21360.722
train: epoch 01: 3/ 40: 13.0 (30.0) Hz objective: 67328685.873
...
train: epoch 01: 39/ 40: 29.7 (29.6) Hz objective: 5179175.587
train: epoch 01: 40/ 40: 29.8 (30.6) Hz objective: 5049697.440
val: epoch 01: 1/ 10: 87.3 (87.3) Hz objective: 49.512
val: epoch 01: 2/ 10: 88.9 (90.5) Hz objective: 50.012
...
val: epoch 01: 9/ 10: 88.2 (88.2) Hz objective: 49.936
val: epoch 01: 10/ 10: 88.1 (87.3) Hz objective: 49.962
train: epoch 02: 1/ 40: 30.2 (30.2) Hz objective: 49.650
train: epoch 02: 2/ 40: 30.3 (30.4) Hz objective: 49.704
...
train: epoch 02: 39/ 40: 30.2 (31.6) Hz objective: 49.739
train: epoch 02: 40/ 40: 30.3 (31.0) Hz objective: 49.722
val: epoch 02: 1/ 10: 91.8 (91.8) Hz objective: 49.687
val: epoch 02: 2/ 10: 92.0 (92.2) Hz objective: 49.831
...
val: epoch 02: 9/ 10: 92.0 (88.5) Hz objective: 49.931
val: epoch 02: 10/ 10: 91.9 (91.1) Hz objective: 49.962
train: epoch 03: 1/ 40: 31.7 (31.7) Hz objective: 49.014
train: epoch 03: 2/ 40: 31.2 (30.8) Hz objective: 49.237
...
Answer (score 0):
The two inputs to pdist have size n*m*64*100, and, as this mentions, the output of pdist has the same height and width but a depth equal to one. As for the correctness of your error definition, you should carefully debug and check the sizes and definitions.
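One way to confirm this is to call the loss function directly on dummy data shaped like one batch (assuming the three-argument form vl_nnpdist(x, x0, p) available in beta25) and inspect the output size:
% dummy prediction and target shaped like one training batch
x  = randn(56,56,64,100,'single') ;
x0 = randn(56,56,64,100,'single') ;
y  = vl_nnpdist(x, x0, 2) ;  % p = 2: distance over the channel dimension at each location
size(y)                      % expected: 56 56 1 100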