Question

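I have asked a few questions about neural networks on this site in the past and got great answers, but I am still struggling to implement one myself. This is quite a long question, but I hope it can serve as a guide for other people creating their own basic neural networks in MATLAB, so it should be worth it.

What I have done so far may well be completely wrong. I am following the online Stanford machine learning course by Professor Andrew Y. Ng and have tried to implement what he teaches as best I can.

Can you please tell me whether the feed-forward and cost function parts of my code are correct, and where I am going wrong in the minimization (optimization) part?

I have a feed-forward 2-layer neural network.

The MATLAB code for the feed-forward part is: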

function [ Y ] = feedforward2( X,W1,W2)
%This takes a row vector of inputs into the neural net with weight matrices W1 and W2 and returns a row vector of the outputs from the neural net

%Remember X, Y, and A can be vectors, and W1 and W2 Matrices 

X=transpose(X);            %X needs to be a column vector
A = sigmf(W1*X,[1 0]);     %Values of the first hidden layer  
Y = sigmf(W2*A,[1 0]);     %Output Values of the network
Y = transpose(Y);          %Return Y as a row vector

For example, a two-layer neural network with two inputs and two outputs looks a bit like this:

      a1
x1 o--o--o y1      (all weights equal 1)
    \/ \/
    /\ /\
x2 o--o--o y2
      a2

If we input:

X=[2,3];
W1=ones(2,2);
W2=ones(2,2);

Y = feedforward2(X,W1,W2)

We get the output:

Y = [0.8794,0.8794]

These are the y1 and y2 values shown in the neural network diagram.

The MATLAB code for the squared error cost function is:

function [ C ] = cost( W1,W2,Xtrain,Ytrain )
%This gives a value seeing how close W1 and W2 are to giving a network that represents the Xtrain and Ytrain data
%It uses the squared error cost function
%The closer the cost is to zero, the better these particular weights are at giving a network that represents the training data
%If the cost is zero, the weights give a network that when the Xtrain data is put in, The Ytrain data comes out

M = size(Xtrain,1);  %Number of training examples

oldsum = 0;

for i = 1:M,
        H = feedforward2(Xtrain,W1,W2); 
        temp = ( H(i) - Ytrain(i) )^2;
        Sum = temp + oldsum;
        oldsum = Sum;
end

C = (1/2*M) * Sum;

end

Example

For example, if the training data is:

Xtrain =[0,0;        Ytrain=[0/57;
        1,2;           3/57;
        4,1;           5/57;
        5,2;           7/57;                                                           a1    
        3,4;           7/57;    %This will be for a two input one output network  x1 o--o y1
        5,3;           8/57;                                                          \/ \_o 
        1,5;           6/57;                                                          /\ /
        6,2;           8/57;                                                      x2 o--o      
        2,1;           3/57;                                                           a2    
        5,5;]          10/57;]

We start with initial random weights:

W1=[2,3;     W2=[3,2]
    4,1]

If we input:

Y= feedforward2([6,2],W1,W2)

We get:

Y = 0.9933

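This is very far from what the training data says it should be (8/57 = 0.1404). So the initial random weights W1 and W2 were a bad guess.

To measure exactly how bad or good a guess the random weights are, we use the cost function: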

C= cost(W1,W2,Xtrain,Ytrain)

This gives the value:

C = 6.6031e+003

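Minimizing the cost function

If we minimize the cost function by searching over all possible values of W1 and W2 and picking the ones with the lowest cost, this gives the network that most closely approximates the training data.

But when I use the code: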

 [W1,W2]=fminsearch(cost(W1,W2,Xtrain,Ytrain),[W1,W2])

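It gives an error message. It says: "Error using horzcat. CAT arguments dimensions are not consistent." Why am I getting this error, and what can I do to fix it?

Can you please tell me whether the feed-forward and cost function parts of my code are correct, and where I am going wrong in the minimization (optimization) part?

Thank you!!!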

Answer 1

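Your neural network seems fine, but the training you are attempting is very inefficient if you are training against labeled data. In that case I would suggest looking into back-propagation.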
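As a rough illustration of the back-propagation idea, here is a minimal sketch of batch gradient descent on the 2-input, 2-hidden, 1-output network from the question (no bias terms, as in feedforward2). The `sigmoid` helper stands in for `sigmf(z,[1 0])` so no toolbox is needed. The initial weights, the learning rate `alpha` and the iteration count are arbitrary assumptions for illustration, not tuned values.

```matlab
% Training data from the question (10 examples, 2 inputs, 1 output)
Xtrain = [0,0; 1,2; 4,1; 5,2; 3,4; 5,3; 1,5; 6,2; 2,1; 5,5];
Ytrain = [0; 3; 5; 7; 7; 8; 6; 8; 3; 10] / 57;
M = size(Xtrain, 1);

sigmoid = @(z) 1 ./ (1 + exp(-z));   % stand-in for sigmf(z,[1 0])

% Hypothetical small initial weights (assumed, to avoid saturated sigmoids)
W1 = [0.1, -0.2; 0.3, 0.2];          % 2x2: input -> hidden
W2 = [0.2, -0.1];                    % 1x2: hidden -> output
alpha = 0.5;                         % learning rate (assumed)
nIter = 500;                         % number of iterations (assumed)
costs = zeros(1, nIter);

for it = 1:nIter
    % Forward pass for all examples at once (rows are examples)
    A = sigmoid(Xtrain * W1');       % M x 2 hidden activations
    H = sigmoid(A * W2');            % M x 1 network outputs
    costs(it) = sum((H - Ytrain).^2) / (2 * M);

    % Backward pass: derivative of the cost w.r.t. each layer's pre-activation
    D2 = (H - Ytrain) .* H .* (1 - H);   % M x 1 output-layer deltas
    D1 = (D2 * W2) .* A .* (1 - A);      % M x 2 hidden-layer deltas

    % Gradient step, averaged over the M examples
    W2 = W2 - alpha * (D2' * A) / M;
    W1 = W1 - alpha * (D1' * Xtrain) / M;
end

fprintf('cost: %.5f -> %.5f\n', costs(1), costs(end));
```

Each iteration costs one forward and one backward pass, whereas a derivative-free search such as fminsearch needs many cost evaluations per step, which is why back-propagation tends to scale better as the number of weights grows.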

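About the error when training: your error message hints at the problem: dimensions are not consistent

As the x0 argument of fminsearch, which is the optimizer's initial guess, you pass [W1, W2], but from what I can see these matrices do not have the same number of rows, so you cannot concatenate them horizontally like that. I would suggest modifying your cost function to take a single vector as its argument and then form the weight matrices for the different layers from that vector.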
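One way to do this packing and unpacking, sketched here with the weight shapes from the question's one-output example (W1 is 2x2, W2 is 1x2). The names `w`, `W1u` and `W2u` are only illustrative.

```matlab
% Weight shapes from the question's 2-input, 2-hidden, 1-output example
W1 = [2,3; 4,1];      % 2x2: input -> hidden
W2 = [3,2];           % 1x2: hidden -> output

% Pack: flatten each matrix column-wise and stack into one column vector
w = [W1(:); W2(:)];   % 6x1, usable as x0 for fminsearch

% Unpack: slice the vector and restore the original shapes
W1u = reshape(w(1:4), 2, 2);
W2u = reshape(w(5:6), 1, 2);

% The round trip recovers the original matrices
disp(isequal(W1, W1u) && isequal(W2, W2u))
```

Because `(:)` and `reshape` both use column-major order, the round trip is exact, and this works for any layer sizes as long as the slice lengths match the matrix sizes.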

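You are also not supplying the cost function to fminsearch correctly, since you are simply evaluating cost in place with W1, W2, Xtrain and Ytrain.

According to the documentation (it has been several years since I used MATLAB), you should pass a handle to the cost function rather than its value: fminsearch(@cost, [W1; W2])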
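To make the difference between calling a function and passing a handle concrete, here is a small sketch using a made-up two-parameter objective (`toycost` is hypothetical and unrelated to the network). fminsearch receives the handle and calls it itself with trial parameter values.

```matlab
% Hypothetical toy objective with its minimum at p = [1, 2]
toycost = @(p) (p(1) - 1)^2 + (p(2) - 2)^2;

p0 = [0, 0];                               % initial guess
[pbest, fbest] = fminsearch(toycost, p0);  % toycost is already a handle

% For a function stored in cost.m the equivalent is fminsearch(@cost, w0).
% Writing fminsearch(cost(...), w0) would evaluate cost once and pass the
% resulting number, which fminsearch cannot call.
disp(pbest)
```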

Edit: you could express your weights and modify your code as follows:

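global Xtrain                 % shared with cost(); assign the data after declaring
global Ytrain
W = [W1; W2];                 % 3x2: rows 1-2 hold W1, row 3 holds W2
W = fminsearch(@cost, W)      % pass a function handle, not the value of cost(...)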

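The cost function must be modified so that it no longer takes Xtrain and Ytrain as inputs, because fminsearch calls the function with only the parameters being optimized. Modify your cost function like this: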

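function [ C ] = cost( W )
   global Xtrain          % training data shared from the calling workspace
   global Ytrain
   W1 = W(1:2,:);         % MATLAB indexes with parentheses, not brackets
   W2 = W(3,:);
   ...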
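As an alternative to globals, the training data can be captured in an anonymous function. The following is only a sketch under that assumption: `sigmoid` stands in for `sigmf(z,[1 0])`, and `predict`, `costvec`, `w0`, `c0`, `wbest` and `cbest` are illustrative names. It uses the normalization 1/(2*M); note that the expression `(1/2*M)` in the question evaluates to `M/2` in MATLAB, because `/` and `*` have equal precedence and are applied left to right.

```matlab
% Training data from the question
Xtrain = [0,0; 1,2; 4,1; 5,2; 3,4; 5,3; 1,5; 6,2; 2,1; 5,5];
Ytrain = [0; 3; 5; 7; 7; 8; 6; 8; 3; 10] / 57;

sigmoid = @(z) 1 ./ (1 + exp(-z));   % stand-in for sigmf(z,[1 0])

% Forward pass over all rows of X for a flat weight vector w = [W1(:); W2(:)]
% (W1 is 2x2, W2 is 1x2, no bias terms, matching feedforward2)
predict = @(w, X) sigmoid(sigmoid(X * reshape(w(1:4), 2, 2)') ...
                          * reshape(w(5:6), 1, 2)');

% Squared-error cost; Xtrain and Ytrain are captured when the handle is created
costvec = @(w) sum((predict(w, Xtrain) - Ytrain).^2) / (2 * size(Xtrain, 1));

% Initial weights from the question: W1 = [2,3; 4,1], W2 = [3,2]
w0 = [2; 4; 3; 1; 3; 2];
c0 = costvec(w0);

[wbest, cbest] = fminsearch(costvec, w0);
fprintf('cost before: %.5f, after: %.5f\n', c0, cbest);
```

fminsearch is a derivative-free simplex search, so it may stall on this problem, particularly from weights that saturate the sigmoids; the sketch shows the calling convention and makes no claim about the quality of the fit.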

Programming a basic neural network from scratch in MATLAB