The loss of my neural network is not decreasing

Time: 2019-08-09 20:45:38

Tags: python machine-learning neural-network deep-learning mnist

I am trying to build a 3-layer neural network to classify digits, using the MNIST dataset. The problem is that the loss freezes at 1.000047619047619.

I have run the program for more than 7000 iterations. I started with 40 hidden units and also tried 100 and 300 without success. The same code solves the XOR problem easily.

Here is the code where I create the NN and train it:

    import numpy as np
    import pandas as pd

    # column 0 of train.csv is the label, the remaining columns are the pixel values
    data = pd.read_csv("train.csv").values
    xtrain = data[0:21000, 1:]
    ytrain = data[0:21000, 0]
    trainy = invec(ytrain).reshape(10, 21000).T   # targets reshaped to (21000, 10)

    textrec = NN(xtrain, trainy, 21000, 1, [784, 300, 10], np.array([relu, softmax]), np.array([drelu]))
    textrec.train()
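invec is not shown in the question; given the reshape to (10, 21000) followed by a transpose, it presumably one-hot encodes the labels. A minimal sketch under that assumption (not the poster's actual code):

    import numpy as np

    def invec(labels, num_classes=10):
        # Hypothetical stand-in for invec: one row per class, one column per
        # example, so invec(y).reshape(10, n).T yields an (n, 10) one-hot matrix.
        onehot = np.zeros((num_classes, labels.shape[0]))
        onehot[labels, np.arange(labels.shape[0])] = 1
        return onehot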

The NN class takes 7 arguments: the input X (the pixels in this case), the output y, the number of training examples n, the learning rate a, the layer sizes Lsize, the activation functions act, and their derivatives dact.
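The constructor itself is not posted; judging from the attribute names used in the forward and backward code (w0_1, w1_2, b1, b2, dw0_1, ...), an initialization roughly like the following seems to be implied. This is an assumption for context, not the poster's code:

    import numpy as np

    class NN:
        def __init__(self, X, y, n, a, Lsize, act, dact):
            # Assumed initialization, reconstructed from the attribute names used below
            self.X, self.y, self.n, self.a = X, y, n, a
            self.act, self.dact = act, dact
            self.w0_1 = np.random.randn(Lsize[0], Lsize[1]) * 0.01  # (784, 300)
            self.w1_2 = np.random.randn(Lsize[1], Lsize[2]) * 0.01  # (300, 10)
            self.b1 = np.zeros((1, Lsize[1]))
            self.b2 = np.zeros((1, Lsize[2]))
            # gradient placeholders so the while-loop in train() can run its first pass
            self.dw0_1 = np.ones_like(self.w0_1)
            self.dw1_2 = np.ones_like(self.w1_2)
            self.db1 = np.ones_like(self.b1)
            self.db2 = np.ones_like(self.b2)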

Here is the forward propagation code:

    def L1(self, x):
        # hidden layer: act[0] (relu) applied to x @ w0_1 + b1
        return self.act[0](np.dot(x, self.w0_1) + self.b1)

    def L2(self, x):
        # output layer: act[1] (softmax) applied to x @ w1_2 + b2
        return self.act[1](np.dot(x, self.w1_2) + self.b2)

    def h(self, x):
        # full forward pass: input -> hidden -> output
        return self.L2(self.L1(x))
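relu, softmax, and drelu are passed in from outside and are not shown either; typical definitions that fit the way they are called here (again an assumption) would be:

    import numpy as np

    def relu(z):
        # element-wise ReLU
        return np.maximum(0, z)

    def drelu(a):
        # ReLU derivative, evaluated on the post-activation values
        # (for ReLU this is the same as evaluating it on the pre-activations)
        return (a > 0).astype(float)

    def softmax(z):
        # row-wise softmax with max-subtraction for numerical stability
        e = np.exp(z - z.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)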

Here is the backpropagation code:

    def train(self):
        t = 0
        while self.dw0_1.all() != 0 or self.dw1_2.all != 0 or self.db1.all != 0 or self.db2.all != 0:

            X = self.X
            L2 = self.h(X)
            y = self.y
            L1 = self.L1(X)

            self.dz2 = (L2 - y)
            self.dw1_2 = np.matmul(self.dz2.T, L1)
            self.db2 = np.sum(self.dz2, axis = 1, keepdims = True)
            self.dz1 = np.matmul(self.w1_2, self.dz2.T) * self.dact[1](L1).T
            self.dw0_1 = np.matmul(self.dz1, X).T
            self.db1 = np.sum(self.dz1, axis = 1, keepdims = True).T

            self.dw0_1 = self.dw0_1.sum(axis = 1, keepdims = True)/self.n * self.a
            self.dw1_2 = self.dw1_2.sum(axis = 1, keepdims = True)/self.n * self.a
            self.db2 = self.db2/self.n * self.a
            self.db1 = self.db1/self.n * self.a

            self.w0_1 = self.w0_1 - self.dw0_1
            self.w1_2 = self.w1_2 - self.dw1_2.T
            self.b2 -= self.db2
            self.b1 -= self.db1
            #print(str([self.w0_1, self.w1_2, self.b1, self.b2]))
            #print(str([self.dw0_1, self.dw1_2, self.db1, self.db2]))
            print("Cost: " + str(((L2 - y) ** 2).sum()/self.n))
            t+=1

        print("finished running: " + str(t) + "times")

This is what I get every time:

Cost: 1.000047619047619
Cost: 0.9999523809523809
Cost: 1.000047619047619
Cost: 1.000047619047619
Cost: 1.000047619047619
Cost: 1.000047619047619
Cost: 1.000047619047619
Cost: 1.000047619047619

It prints the same output for learning rates anywhere from 0.1 to 100.

0 Answers:

No answers yet