Question

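I am trying to write my first neural network, but I have been completely stuck on this problem for over a week. I am following Andrew Ng's course on machine learning, and I have implemented the following functions in Python.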

    forwardPropogate()   # does forward propagation
    backwardPropogate()  # computes the gradients using backpropagation
    costFunction()       # takes all the network parameters, rolled up into a single array, and computes the cost
    gradientDescend()    # tries to minimise the cost using gradient descent

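When I tried training the network, it gave me very bad results. When I couldn't figure out what was wrong with the code, I downloaded the MATLAB version of the code and tried to compare it with my own.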

To make sure my implementation is correct, I ran the MATLAB code, took the parameters it found, and ran them through my backwardPropogate() and costFunction().

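I plotted the gradients produced by the MATLAB code and by my own backwardPropogate(), and they are very similar. I also inspected the two outputs by hand closely enough to convince myself that backwardPropogate() is implemented correctly. I ran numerical gradient checking as well, and the results match very well.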
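
For reference, the numerical gradient check mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code from the course: `numerical_gradient`, the toy `cost`, and its `analytic` gradient are hypothetical stand-ins for costFunction() and backwardPropogate().

```python
import numpy as np

def numerical_gradient(cost, theta, eps=1e-4):
    """Approximate the gradient of cost at theta with central differences."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step.flat[i] = eps  # perturb one parameter at a time
        grad.flat[i] = (cost(theta + step) - cost(theta - step)) / (2 * eps)
    return grad

# Toy cost (an assumption for illustration) with a known analytic gradient.
cost = lambda t: float(np.sum(t ** 2) + np.sum(np.sin(t)))
analytic = lambda t: 2 * t + np.cos(t)

theta = np.array([0.3, -1.2, 2.0])
diff = np.max(np.abs(numerical_gradient(cost, theta) - analytic(theta)))
print(diff)  # a small value suggests the analytic gradient is right
```

A check like this only validates the gradient against the cost; it says nothing about how the trained parameters are used afterwards.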

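The cost of the parameters found by the MATLAB code is J = 0.14942, and Python gives J = 0.149420032652. I am convinced that costFunction() and backwardPropogate() are implemented correctly (should I not be?).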

When I run my gradientDescend(), I get a plot of the cost against the number of iterations. This again looks good.

I can't understand why the code still gives me bad results. Even on the training set, the success rate is close to 10%.

Here is my gradient descent function and the call to it.

    def gradientDescend(self,gradientFunction,thetaValues):

        JValues = np.zeros(MAX_ITER)

        for i in range(0,MAX_ITER):            
            thetaValues = thetaValues - ALPHA * gradientFunction(thetaValues)

            J = self.costFunction(thetaValues)
            JValues[i] = J

            print(100.0 * i / MAX_ITER) #show percentage completed

        return thetaValues,JValues

    def train(self):

        thetaValues = (np.random.rand(NoTheta1+NoTheta2,1) * (2 * INIT_EPSILON)) - INIT_EPSILON 

        trainedThetas,JVals = self.gradientDescend(self.getGradients,thetaValues)        
        self.theta1,self.theta2 = self.unrollParameters(thetaValues)

        xaxis = np.arange(0,len(JVals))
        plt.plot(xaxis,JVals)
        plt.show()

        return self.theta1,self.theta2

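On further inspection, I found that the initial random parameter values perform just as badly as my trained ones! Of everything, this is what I understand least. The cost function appears to decrease from the start of the loop to the end, so even if the final parameters are not good, they should at least do better than the initial ones. I don't know where to go from here. Any suggestions are welcome.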

Answer 1

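In train(), the output of gradientDescend(), trainedThetas, is never actually used. On the line after the call to gradientDescend(), self.unrollParameters(thetaValues) takes the original random vector thetaValues. That is why the network predicts no better than it did with its initial parameters, even though the cost plot decreases.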
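
The reason the caller's vector stays untouched is that `thetaValues = thetaValues - ...` inside the loop builds a new array and rebinds the local name. A minimal sketch of that behaviour, using a hypothetical `descend` helper and a toy quadratic cost rather than the network from the question:

```python
import numpy as np

def descend(theta, grad, alpha=0.1, iters=50):
    # `theta = theta - ...` creates a new array and rebinds the local name;
    # the array the caller passed in is never modified.
    for _ in range(iters):
        theta = theta - alpha * grad(theta)
    return theta

# Toy cost J(theta) = sum(theta**2), whose gradient is 2 * theta.
initial = np.array([1.0, -2.0])
trained = descend(initial, lambda t: 2 * t)

print(initial)  # still the starting point
print(trained)  # much closer to the minimum at 0
```

Only the returned value carries the result of the descent, so it has to be the one passed on.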

Replace thetaValues with trainedThetas in the call to unrollParameters() and you should be good to go.
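
A self-contained sketch of the corrected flow, under some assumptions: `TinyModel` is a hypothetical stand-in for the class in the question, its quadratic cost replaces the real network, the constants are illustrative values, and the unrollParameters() step is reduced to storing the flat trained vector.

```python
import numpy as np

ALPHA = 0.1          # illustrative values, not the asker's settings
MAX_ITER = 200
INIT_EPSILON = 0.12

class TinyModel:
    """Hypothetical stand-in for the network class, with a quadratic cost."""

    def __init__(self, n_params, seed=0):
        self.n_params = n_params
        self.rng = np.random.default_rng(seed)
        self.theta = None

    def costFunction(self, theta):
        return float(np.sum(theta ** 2))

    def getGradients(self, theta):
        return 2 * theta

    def gradientDescend(self, gradientFunction, thetaValues):
        JValues = np.zeros(MAX_ITER)
        for i in range(MAX_ITER):
            thetaValues = thetaValues - ALPHA * gradientFunction(thetaValues)
            JValues[i] = self.costFunction(thetaValues)
        return thetaValues, JValues

    def train(self):
        thetaValues = self.rng.random((self.n_params, 1)) * (2 * INIT_EPSILON) - INIT_EPSILON
        trainedThetas, JVals = self.gradientDescend(self.getGradients, thetaValues)
        # Keep the trained vector, not the initial random one.
        self.theta = trainedThetas
        return thetaValues, JVals

model = TinyModel(n_params=5)
initialThetas, JVals = model.train()
print(model.costFunction(initialThetas), model.costFunction(model.theta))
```

In the original train(), the equivalent change is passing trainedThetas to self.unrollParameters().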

Neural network gives bad predictions even when everything seems to be working