I'm playing around with a simple neural network, following this tutorial: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/, and I'm running into problems with backpropagation. When I backpropagate all the way to the initial input weights, my matrix shapes don't match.
I'm guessing it's a simple linear algebra mistake, but I wonder whether the programming language used in the tutorial is throwing me off, or whether my problem starts earlier in the code.
If anyone has any idea what I might be doing wrong, please let me know!
My code:
import numpy as np
inputM = np.matrix([
[0,1],
[1,0],
[1,1],
[0,0]
])
outputM = np.matrix([
[0],
[0],
[1],
[1]
])
neurons = 3
mu, sigma = 0, 0.1 # mean and standard deviation
weights = np.random.normal(mu, sigma, len(inputM.T) * neurons)
weightsMatrix = np.matrix(weights).reshape(3,2)
weights = np.matrix(weights)
# Forward pass
inputHidden = inputM * weightsMatrix.T
hiddenLayerLog = 1 / (1 + np.exp(inputHidden))
hiddenWeights = np.random.normal(mu, sigma, neurons)[np.newaxis, :]
sumOfHiddenLayer = np.sum(hiddenWeights + hiddenLayerLog, axis=1)
predictedOutput = 1 / (1 + np.exp(sumOfHiddenLayer))
residual = outputM - predictedOutput
logDerivative = 1 / (1 + np.exp(sumOfHiddenLayer)) * (1 - 1 / (1 + np.exp(sumOfHiddenLayer))).T
deltaOutputSum = logDerivative * residual
# Backward pass
deltaWeights = deltaOutputSum / hiddenLayerLog
newHiddenWeights = hiddenWeights - deltaWeights
deltaHiddenSum = (deltaOutputSum / hiddenWeights)
deltaHiddenSum = deltaHiddenSum.T * (1 / (1 + np.exp(inputHidden))) * (1 - 1 / (1 + np.exp(inputHidden))).T
newInputWeights = np.array(deltaHiddenSum) / np.array(inputM)
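For comparison, here is a minimal shape-consistent sketch of the same 2-3-1 network, written the way the chain rule usually pans out in NumPy: sigmoid with `exp(-z)` (note the minus sign), and transposed matrix products instead of element-wise division, so every gradient comes out with the same shape as the weight matrix it updates. This is a sketch of the general technique, not a line-by-line fix of the code above; the variable names are my own.

```python
import numpy as np

rng = np.random.default_rng(0)

X = np.array([[0, 1], [1, 0], [1, 1], [0, 0]], dtype=float)  # 4x2 inputs
y = np.array([[0], [0], [1], [1]], dtype=float)              # 4x1 targets

neurons = 3
W1 = rng.normal(0, 0.1, size=(2, neurons))   # input -> hidden, 2x3
W2 = rng.normal(0, 0.1, size=(neurons, 1))   # hidden -> output, 3x1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))          # note the minus sign

# Forward pass
hidden = sigmoid(X @ W1)                     # 4x3
output = sigmoid(hidden @ W2)                # 4x1

# Backward pass: transposes make every shape line up
delta_out = (y - output) * output * (1 - output)           # 4x1
grad_W2 = hidden.T @ delta_out                             # 3x1, same shape as W2
delta_hidden = (delta_out @ W2.T) * hidden * (1 - hidden)  # 4x3
grad_W1 = X.T @ delta_hidden                               # 2x3, same shape as W1

print(grad_W1.shape, grad_W2.shape)
```

The key point is that the gradients are accumulated with `X.T @ ...` and `hidden.T @ ...` rather than divided element-wise by the inputs, so there is never a division by the zeros in `X` and the update matrices always match the weights they correct.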