Can't seem to implement L2 regularization correctly in Python - low accuracy scores

Asked: 2017-09-18 02:19:39

Tags: python neural-network deep-learning regularized

I'm trying to add regularization to my MNIST digit NN classifier, which I built with numpy and vanilla Python. I'm currently using sigmoid activations with a cross-entropy cost function.

Without the regularizer, I get about 97% accuracy.
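For reference, the regularized cost I'm trying to minimize is the usual cross-entropy plus an L2 penalty of (lambd / 2m) times the sum of squared weights. A minimal sketch of my understanding (the function name `regularized_cost` is just for illustration, assuming the weight matrices are stored in a dict keyed 'w1', 'w2', ...):

import numpy as np

def regularized_cost(a, y, parameters, lambd):
    """Cross-entropy cost plus the L2 penalty (lambd / 2m) * sum(w ** 2).

    a: network output activations, shape (m, classes)
    y: one-hot labels, shape (m, classes)
    parameters: dict of weight matrices keyed 'w1', 'w2', ...
    """
    m = y.shape[0]
    cross_entropy = -np.sum(y * np.log(a) + (1 - y) * np.log(1 - a)) / m
    l2_penalty = (lambd / (2 * m)) * sum(
        np.sum(w ** 2) for key, w in parameters.items() if key.startswith('w'))
    return cross_entropy + l2_penalty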

However, once I add the regularizer, I only get around 11% accuracy, despite trying different hyperparameters. I've tried learning rates of:

.001, .1, 1

and different lambd values, such as:

.5, .8, 1.0, 2.0, etc.

I can't figure out what mistake I'm making. I feel like I might be missing a step?
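One way I've been trying to narrow this down is numerical gradient checking, comparing my analytic gradients against central finite differences of the regularized cost. A minimal sketch of what I mean (the `cost` callable and `eps` value are placeholders, not from my notebook):

import numpy as np

def numerical_gradient(cost, w, eps=1e-5):
    """Estimate dC/dw by central finite differences, one entry at a time.

    cost: callable taking the weight matrix and returning a scalar cost
    w:    weight matrix to perturb (modified in place, then restored)
    """
    grad = np.zeros_like(w)
    it = np.nditer(w, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        original = w[idx]
        w[idx] = original + eps
        cost_plus = cost(w)
        w[idx] = original - eps
        cost_minus = cost(w)
        w[idx] = original  # restore the original entry
        grad[idx] = (cost_plus - cost_minus) / (2 * eps)
        it.iternext()
    return grad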

The only change I made is to the derivatives of the weights. I've implemented the gradients as follows:

def calculate_gradients(self, x, y, lambd):
    '''Calculate all gradients of the cost with respect to the
    weights and biases. The cost function is cross-entropy, so for
    the output layer dC/dZ = (a - y) (z is the logit).

    All weight gradients also include the L2 regularization term
    (lambd / m) * w, where m = x.shape[0] is the sample size.
    '''

    ##### First we calculate the output layer gradients #####

    gradients, activations, zs = self.gather_backprop_data(x, y)

    # Gradient of the cost with respect to z of the last layer
    last_layer_z_error = activations[-1] - y

    # Weight derivatives of the final layer, plus the L2 term
    gradients['w' + str(self.num_layers - 1)] = (
        np.dot(activations[-2].T, last_layer_z_error) / x.shape[0]
        + (lambd / x.shape[0]) * self.parameters['w' + str(self.num_layers - 1)])

    gradients['b' + str(self.num_layers - 1)] = np.mean(last_layer_z_error, axis=0)
    gradients['b' + str(self.num_layers - 1)] = np.expand_dims(
        gradients['b' + str(self.num_layers - 1)], 0)

    ##### Hidden layer gradients #####

    z_previous_layer = last_layer_z_error

    for i in reversed(range(1, self.num_layers - 1)):
        # Backpropagate the error through layer i+1's weights
        z_previous_layer = (np.dot(z_previous_layer, self.parameters['w' + str(i + 1)].T)
                            * sigmoid_derivative(zs[i - 1]))

        gradients['w' + str(i)] = (np.dot(activations[i - 1].T, z_previous_layer) / x.shape[0]
                                   + (lambd / x.shape[0]) * self.parameters['w' + str(i)])
        gradients['b' + str(i)] = np.mean(z_previous_layer, axis=0)
        gradients['b' + str(i)] = np.expand_dims(gradients['b' + str(i)], 0)

    return gradients
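For completeness, this is how I apply the gradients. Since the L2 term is already folded into the weight gradients, the update is plain gradient descent (a minimal sketch; the method name `update_parameters` and the learning rate argument `eta` are illustrative):

def update_parameters(self, gradients, eta):
    """Plain gradient descent step. The L2 term is already included
    in the weight gradients, so no extra weight decay is applied here."""
    for i in range(1, self.num_layers):
        self.parameters['w' + str(i)] -= eta * gradients['w' + str(i)]
        self.parameters['b' + str(i)] -= eta * gradients['b' + str(i)]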

If needed, I've uploaded the entire notebook to GitHub:

https://github.com/moondra2017/Neural-Networks-from-scratch/blob/master/Neural%20Network%20from%20scratch-Testing%20expanded%20Mnist-Sigmoid%20with%20cross-entroupy-with%20L2%20regularization.ipynb

0 Answers