Question

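This may be a silly question, but I'm a bit confused. I'm trying to write a simple feedforward neural network in Python. My input, weights, and output layers are declared as follows: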

self.inp = np.zeros(21)
self.weights1 = np.random.rand(self.inp.shape[0],15) 
self.weights2 = np.random.rand(15, 15)
self.layer1 = self.sigmoid(np.dot(self.inp, self.weights1))
self.output = self.sigmoid(np.dot(self.layer1, self.weights2))

Now I'm trying to backpropagate, but the shapes of the vectors don't line up. Here is my backpropagation function:

def backpropagate(self, dice, board):
    y = argmax(dice, self.moves)
    d_weights2 = np.dot(self.layer1.T, (2*(y - self.output) * self.sigmoidDerivative(self.output)))
    d_weights1 = np.dot(self.inp.T,  (np.dot(2*(y - self.output) * self.sigmoidDerivative(self.output), self.weights2.T) * self.sigmoidDerivative(self.layer1)))

    self.weights1 += d_weights1
    self.weights2 += d_weights2

I get an error when computing d_weights1. The error is

ValueError: shapes (21,) and (15,) not aligned: 21 (dim 0) != 15 (dim 0)

How can I make the vectors fit?

Thanks!

Edit:

As requested, here is the whole class:

import numpy as np
from TestValues import argmax, testfunctions, zero

class AI:

    def __init__(self):
        self.moves = []
        self.inp = np.zeros(21)
        self.weights1 = np.random.rand(self.inp.shape[0],21) 
        self.weights2 = np.random.rand(21, 15)
        self.output = np.zeros(15)

    def getPlacement(self, dice, board):
        self.feedforward(dice, board)
        self.backpropagate(dice, board)
        result = self.output
        for x in self.moves:
            result[x] = -1.
        move = np.argmax(result)
        self.moves.append(move)
        return move

    def feedforward(self, dice, board):
        i = 0
        for x in dice:
            self.inp[i] = x
            i += 1
        for x in board:
            self.inp[i] = x
            i += 1

        self.layer1 = self.sigmoid(np.dot(self.inp, self.weights1))
        self.output = self.sigmoid(np.dot(self.layer1, self.weights2))

    def backpropagate(self, dice, board):

        y = argmax(dice, self.moves)

        d_weights2 = np.dot(self.layer1.T, np.dot(2*(y - self.output), self.sigmoidDerivative(self.output)))
        d_weights1 = np.dot(self.inp.T,  (np.dot(2*(y - self.output) * self.sigmoidDerivative(self.output), self.weights2.T) * self.sigmoidDerivative(self.layer1)))

        print(self.weights2.shape)

        self.weights1 += d_weights1
        self.weights2 += d_weights2

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def sigmoidDerivative(self, x):
        return self.sigmoid(x) * (1 - self.sigmoid(x))

Answer 1

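The problem seems to be the way you initialize your input. You are creating an array of shape (21,) rather than (1, 21). For a 1-D array, .T has no effect, so np.dot(self.inp.T, ...) tries to take the inner product of a 21-element vector and a 15-element vector instead of building a matrix. This would have become apparent sooner or later if you planned to backpropagate many training examples at once. It is also usually worth checking the shapes of the intermediate results while debugging; for example, my d_weights2 came out as a single scalar. And if you are not familiar with matrix algebra, learning it helps a lot in understanding dot products and what shape they should produce.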
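A quick way to see this is to print the shapes. The sketch below uses random stand-in arrays (not the asker's actual data) with the sizes from the first snippet in the question: 21 inputs, 15 hidden units, 15 outputs. The names `delta` and `hidden_delta` are made up for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

inp = np.zeros(21)        # 1-D array, shape (21,)
layer1 = rng.random(15)   # stand-in for the hidden activations, shape (15,)
delta = rng.random(15)    # stand-in for 2*(y - output) * sigmoidDerivative(output)

# Transposing a 1-D array does nothing: there is no second axis to swap.
print(inp.T.shape)        # (21,)

# np.dot of two 1-D arrays is an inner product, so this "gradient" is a
# scalar rather than a (15, 15) matrix.
d_weights2 = np.dot(layer1.T, delta)
print(np.ndim(d_weights2))  # 0

# With mismatched lengths the inner product is undefined, hence the ValueError.
hidden_delta = rng.random(15)
raised = False
try:
    np.dot(inp.T, hidden_delta)
except ValueError as exc:
    raised = True
    print(exc)
```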

So, to put it simply, just initialize it like this:

inp = np.zeros((1, 21))

This gave me sensible shapes.
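As a minimal check, assuming the sizes from the first snippet in the question, the sketch below stores the input as a row vector and confirms that both gradients end up with the same shape as the weights they update. The arrays are random stand-ins, and `delta` and `hidden_delta` are names introduced only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

inp = np.zeros((1, 21))            # row vector, shape (1, 21)
weights1 = rng.random((21, 15))
weights2 = rng.random((15, 15))

layer1 = 1 / (1 + np.exp(-(inp @ weights1)))   # shape (1, 15)

delta = rng.random((1, 15))        # stand-in for the output-layer error term
hidden_delta = (delta @ weights2.T) * rng.random((1, 15))  # shape (1, 15)

d_weights2 = layer1.T @ delta      # (15, 1) @ (1, 15) -> (15, 15)
d_weights1 = inp.T @ hidden_delta  # (21, 1) @ (1, 15) -> (21, 15)

print(d_weights2.shape == weights2.shape)
print(d_weights1.shape == weights1.shape)
```

Note that feedforward in the question fills the input with `self.inp[i] = x`; with a 2-D input that indexing would have to become `self.inp[0, i] = x`.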

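Even though this isn't CodeReview, I can't help commenting on your code. Don't repeat yourself. When backpropagating, you can first compute the error on a layer and then use it in both updates:

error = 2*(output - y) * d_logistic(output)

This will also simplify things if you plan to extend the network to an arbitrary number of layers rather than just two.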
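As a rough sketch of that idea, the function below computes the output error term once and reuses it for both gradients. The names `backprop_step`, `logistic`, `loss`, and `learning_rate` are illustrative and not from the question. The sketch makes three assumptions:

- inputs are 2-D row vectors;
- `d_logistic` receives the already-activated output, so the derivative is a * (1 - a);
- because the error is written with `output - y`, the gradients are subtracted from the weights rather than added.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def d_logistic(a):
    # Derivative of the logistic function, expressed via its output a.
    return a * (1.0 - a)

def backprop_step(inp, y, weights1, weights2, learning_rate=0.1):
    """One gradient step for a two-layer network with squared-error loss."""
    layer1 = logistic(inp @ weights1)
    output = logistic(layer1 @ weights2)

    # Computed once, then used for both weight updates.
    error = 2 * (output - y) * d_logistic(output)
    hidden_error = (error @ weights2.T) * d_logistic(layer1)

    d_weights2 = layer1.T @ error
    d_weights1 = inp.T @ hidden_error

    # Minus sign: the error term uses (output - y), so we descend the gradient.
    new_w1 = weights1 - learning_rate * d_weights1
    new_w2 = weights2 - learning_rate * d_weights2
    return new_w1, new_w2

rng = np.random.default_rng(0)
inp = rng.random((1, 21))
y = np.zeros((1, 15))
y[0, 3] = 1.0                       # hypothetical one-hot target
w1 = rng.random((21, 15)) - 0.5
w2 = rng.random((15, 15)) - 0.5

def loss(w1, w2):
    return np.sum((logistic(logistic(inp @ w1) @ w2) - y) ** 2)

before = loss(w1, w2)
for _ in range(50):
    w1, w2 = backprop_step(inp, y, w1, w2)
after = loss(w1, w2)
print(before > after)
```

Extending this to more layers would mean repeating the `hidden_error` line once per layer, walking backwards through the weights.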

One more thing: your sigmoid and sigmoidDerivative functions make no use of the class. Consider making them pure functions rather than class methods.
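For instance, a minimal sketch at module level. The snake_case names are my own choice; the formulas match the definitions in the question.

```python
import numpy as np

def sigmoid(x):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    """Derivative of sigmoid with respect to its pre-activation input x."""
    s = sigmoid(x)
    return s * (1.0 - s)

print(sigmoid(0.0))             # 0.5
print(sigmoid_derivative(0.0))  # 0.25
```

Inside the class, the calls would then be `sigmoid(...)` instead of `self.sigmoid(...)`.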
