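Computing the convolutional layer in a CNN implementation

Asked: 2015-05-12 14:36:06

Tags: matlab neural-network

I am trying to train a convolutional neural network, using a sparse autoencoder to compute the filters for the convolutional layer. I am using the UFLDL code to build the patches and train the CNN. My code is the following: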

% =========================================================================
imageDim = 30;         % image dimension
imageChannels = 3;     % number of channels (rgb, so 3)

patchDim = 10;          % patch dimension
numPatches = 100000;    % number of patches

visibleSize = patchDim * patchDim * imageChannels;  % number of input units 
outputSize = visibleSize;   % number of output units
hiddenSize = 400;           % number of hidden units 

epsilon = 0.1;         % epsilon for ZCA whitening

poolDim = 10;          % dimension of pooling region

optTheta =  zeros(2*hiddenSize*visibleSize+hiddenSize+visibleSize, 1);
ZCAWhite =  zeros(visibleSize, visibleSize);
meanPatch = zeros(visibleSize, 1);

load patches_16_1
% =========================================================================

% Display and check to see that the features look good
W = reshape(optTheta(1:visibleSize * hiddenSize), hiddenSize, visibleSize);
b = optTheta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);

displayColorNetwork( (W*ZCAWhite)');

stepSize = 100; 
assert(mod(hiddenSize, stepSize) == 0, 'stepSize should divide hiddenSize');

load train.mat % loads numTrainImages, trainImages, trainLabels
load train.mat  % loads numTestImages,  testImages,  testLabels
% size 30x30x3x8862

numTestImages = 8862;
numTrainImages = 8862;

pooledFeaturesTrain = zeros(hiddenSize, numTrainImages, ...
  floor((imageDim - patchDim + 1) / poolDim), ...
  floor((imageDim - patchDim + 1) / poolDim) );
pooledFeaturesTest = zeros(hiddenSize, numTestImages, ...
  floor((imageDim - patchDim + 1) / poolDim), ...
  floor((imageDim - patchDim + 1) / poolDim) );

tic();

testImages = trainImages;

for convPart = 1:(hiddenSize / stepSize)

  featureStart = (convPart - 1) * stepSize + 1;
  featureEnd = convPart * stepSize;

  fprintf('Step %d: features %d to %d\n', convPart, featureStart, featureEnd);
  Wt = W(featureStart:featureEnd, :);
  bt = b(featureStart:featureEnd);

  fprintf('Convolving and pooling train images\n');
  convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
    trainImages, Wt, bt, ZCAWhite, meanPatch);
  pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
  pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;
  toc();
  clear convolvedFeaturesThis pooledFeaturesThis;

  fprintf('Convolving and pooling test images\n');
  convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
    testImages, Wt, bt, ZCAWhite, meanPatch);
  pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
  pooledFeaturesTest(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;
  toc();

  clear convolvedFeaturesThis pooledFeaturesThis;

end

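I am having trouble computing the convolution and pooling layers. On the line pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = pooledFeaturesThis; I get a "Subscripted assignment dimension mismatch" error. The patches were computed normally, and they are:

[image: the computed patches]

I am trying to understand what exactly the convPart variable does and what pooledFeaturesThis is. Secondly, I noticed that my problem is a mismatch on the line pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = pooledFeaturesThis; where I get the message that the variables do not match. The size of pooledFeaturesThis is 100x3x2x2, whereas the size of pooledFeaturesTrain is 400x8862x2x2. What exactly does pooledFeaturesTrain represent? Is the 2x2 the result for each filter? CnnConvolve can be found here

Edit: I have changed my code a little and it works now. However, I am still unsure whether I understand the code.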

1 answer:

Answer 0 (score: 1)

OK, in this line you are setting the pooling region.

poolDim = 10;          % dimension of pooling region

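This means that for each kernel in each layer, you take the image and pool over regions of 10x10 pixels. From your code it looks like you are applying a mean function, which means it takes a patch, computes the mean, and passes that value on to the next layer... that is, it would take an image from 100x100 down to 10x10. In your network, convolution + pooling is applied until you end up with a 2x2 image, judging by this output (by the way, in my experience this is generally not good practice).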

400x8862x2x2
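To make the pooling step concrete, here is a minimal sketch of mean pooling over non-overlapping regions. It is not the cnnPool from the question: the 21x21 feature map is made-up data (21 is imageDim - patchDim + 1 = 30 - 10 + 1), and dropping the leftover row and column is an assumption taken from the floor(...) in the question's initialization.

```matlab
% Hypothetical single feature map; values are arbitrary demo data.
poolDim = 10;
convolved = reshape(1:21*21, 21, 21);

% Number of complete pooling regions per side; the leftover row and
% column (21 - 2*10 = 1) are dropped, mirroring the floor() in the question.
numRegions = floor(size(convolved, 1) / poolDim);   % floor(21 / 10) = 2

pooled = zeros(numRegions, numRegions);
for r = 1:numRegions
  for c = 1:numRegions
    rows = (r - 1) * poolDim + (1:poolDim);
    cols = (c - 1) * poolDim + (1:poolDim);
    block = convolved(rows, cols);
    pooled(r, c) = mean(block(:));   % mean over one 10x10 region
  end
end

disp(size(pooled));
```

With these numbers each feature map ends up as 2x2, which is where the trailing 2x2 in the sizes above comes from.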

Anyway, back to your code. Notice that at the beginning of training you do the following initialization:

pooledFeaturesTrain = zeros(hiddenSize, numTrainImages, floor((imageDim - patchDim + 1) / poolDim), floor((imageDim - patchDim + 1) / poolDim) );

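So your error is quite simple and accurate: the matrix holding the convolution + pooling output does not have the same size as the matrix you initialized.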
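One way to see the mismatch before the assignment fails is to compare the sizes explicitly. The sketch below uses the values from the question; the zeros(100, 3, 2, 2) array is only a stand-in for the pooledFeaturesThis size reported there, and expectedSize and actualSize are names introduced for illustration.

```matlab
% Values taken from the question.
imageDim = 30; patchDim = 10; poolDim = 10;
stepSize = 100; numTrainImages = 8862;

convDim = imageDim - patchDim + 1;      % 30 - 10 + 1 = 21
pooledDim = floor(convDim / poolDim);   % floor(21 / 10) = 2

% Size of the slice pooledFeaturesTrain(featureStart:featureEnd, :, :, :).
expectedSize = [stepSize, numTrainImages, pooledDim, pooledDim];

% Stand-in for the cnnPool output reported in the question (100x3x2x2).
pooledFeaturesThis = zeros(100, 3, 2, 2);
actualSize = size(pooledFeaturesThis);

if ~isequal(actualSize, expectedSize)
  badDims = find(actualSize ~= expectedSize);
  fprintf('Size mismatch: expected %s, got %s (dimension %d differs)\n', ...
    mat2str(expectedSize), mat2str(actualSize), badDims(1));
end
```

For the sizes reported in the question, only the second dimension differs (3 instead of 8862), that is, the number of images.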

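Now the question is how to fix it. I think the lazy way to fix it is to take out the initialization. That will slow your code down considerably, and it is not guaranteed to work if you have more than one layer.

I suggest that you instead make pooledFeaturesTrain a cell array of 3-D arrays. So instead of this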

pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = pooledFeaturesThis; 

you would do something more like:

pooledFeaturesTrain{n}(:, :, :) = pooledFeaturesThis; 

where n is the current layer.
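As a rough sketch of the cell-array idea, assuming n counts the iterations of the convPart loop from the question (one batch of stepSize features per iteration). The name pooledByBatch, the image count of 5, and the stand-in data are all hypothetical and only keep the example small.

```matlab
hiddenSize = 400;
stepSize = 100;
numBatches = hiddenSize / stepSize;

% One cell per batch; each cell can hold an array of any size.
pooledByBatch = cell(numBatches, 1);

for n = 1:numBatches
  % Stand-in for the cnnPool output of batch n (5 demo images, 2x2 pooled maps).
  pooledFeaturesThis = n * ones(stepSize, 5, 2, 2);
  pooledByBatch{n} = pooledFeaturesThis;
end

% If all batches turn out to have matching sizes, they can be stacked
% along the feature dimension afterwards.
pooledAll = cat(1, pooledByBatch{:});
disp(size(pooledAll));
```

Storing each result in its own cell avoids the size check that the preallocated 4-D array enforces, so a mismatch only surfaces later, when the pieces are combined.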

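CNNs are not as easy as they are cracked up to be, and even when they do not crash, getting them to train well is a feat. I strongly suggest reading up on the theory behind CNNs; it will make coding and debugging much easier.

Good luck! :)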