My feed-forward guess function works fine, but when I train the network, backpropagation does not adjust the weights correctly, and all of the outputs converge to ~.5.
I have gone through several different coding tutorial videos, as well as plenty of reading on the theory and the math, but none of it seems to help with my code.
import random

import numpy as np


class NeuralNetwork:
    def __init__(self, IN, HL, HN, ON, LR):
        """
        Creates a neural network with IN number of input nodes,
        HL number of hidden layers, HN number of hidden nodes,
        ON number of output nodes, and a learning rate of LR
        """
        self.weights = []
        self.bias = []
        self.layers = HL + 2
        self.sets = HL + 1
        self.learningRate = LR
        if HL == 0:
            # No hidden layer: a single weight matrix maps inputs to outputs
            inputWeights = np.random.rand(ON, IN)
            inputWeights = np.asmatrix(inputWeights)
            self.weights.append(inputWeights)
        else:
            # Input -> first hidden layer
            inputWeights = np.random.rand(HN, IN)
            inputWeights = np.asmatrix(inputWeights)
            self.weights.append(inputWeights)
            # Hidden -> hidden layers
            for x in range(self.layers - 3):
                self.bias.append(random.random())
                hiddenWeights = np.random.rand(HN, HN)
                hiddenWeights = np.asmatrix(hiddenWeights)
                self.weights.append(hiddenWeights)
            # Last hidden layer -> output
            hiddenWeights = np.random.rand(ON, HN)
            hiddenWeights = np.asmatrix(hiddenWeights)
            self.weights.append(hiddenWeights)
    @staticmethod
    def sigmoid(x):
        y = (1 / (1 + (np.exp(-x))))
        return y

    @staticmethod
    def dsigmoid(x):
        # Derivative of the sigmoid, written in terms of the sigmoid's output
        y = x
        y_1 = (1 - x)
        z = np.multiply(y, y_1)
        return z
    def train(self, inputs, targets):
        a = []
        inputs = np.mat(inputs).T
        targets = np.mat(targets).T
        a.append(inputs)
        z = []
        # Forward pass: pre-activations go in z, activations in a
        for x in range(self.layers - 1):
            z.append(np.dot(self.weights[x], a[x]))
            a.append(NeuralNetwork.sigmoid(z[x]))
        cost = np.square(np.subtract(a[-1], targets))
        error = []
        deltaW = []
        deltaB = []
        """
        This is the start of back propagation.
        """
        error.append(np.subtract(a[-1], targets))
        err = np.multiply(NeuralNetwork.dsigmoid(a[-1]), a[self.layers - 2])
        deltaW.append(np.asmatrix(np.multiply(error[-1].T, err.T)))
        deltaB.append(error[-1].T)
        for x in reversed(range(1, self.sets)):
            error_temp = np.multiply(NeuralNetwork.dsigmoid(a[x + 1]), self.weights[x])
            error.insert(0, np.multiply(error_temp, error[-1]))
            err = np.multiply(NeuralNetwork.dsigmoid(a[x]), a[x - 1])
            deltaW.insert(0, np.multiply(err, error[0]).T)
            deltaB.insert(0, error[0])
        """
        Tweak the weights and biases of the neural network
        """
        for x in range(self.sets):
            self.weights[x] -= self.learningRate * deltaW[x]
brain = NeuralNetwork(2, 1, 2, 1, .5)
inputs = [[1, 0], [0, 1], [0, 0], [1, 1]]
target = [[1], [1], [0], [0]]
for x in range(50000):
    for x in range(5):
        r = random.randint(0, 3)
        brain.train(inputs[r], target[r])
for x in inputs:
    brain.guess(x)
Answer 0 (score: 0):
Who on earth said that weight matrices in the [NextLayer x CurrentLayer] form are easier? It's terrible! Forget it!
Keep your input matrices the way you have them now. Generate your weight matrices and biases as:
w0 = np.random.normal(0., 1., size=(IN, HN))
w1 = np.random.normal(0., 1., size=(HN, ON))
b0 = np.zeros(HN)
b1 = np.zeros(ON)
This way, your forward propagation becomes much simpler:
z, a = [], [inputs]
for i in range(self.layers - 1):
    z.append(a[-1] @ w[i] + b[i])
    a.append(self.sigmoid(z[-1]))
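For instance, here is a minimal sketch of how that forward pass runs on the XOR-style data from the question, assuming the weight matrices and biases above are simply collected into lists w and b (those list names and the standalone sigmoid are assumptions for illustration, not the answer's code):

import numpy as np

def sigmoid(x):
    return 1. / (1. + np.exp(-x))

IN, HN, ON = 2, 2, 1  # same sizes as the question: 2 inputs, 2 hidden nodes, 1 output

# Weight matrices oriented [CurrentLayer x NextLayer], biases as vectors
w = [np.random.normal(0., 1., size=(IN, HN)),
     np.random.normal(0., 1., size=(HN, ON))]
b = [np.zeros(HN), np.zeros(ON)]

# One row per training example, so no transposes are needed anywhere
inputs = np.array([[1, 0], [0, 1], [0, 0], [1, 1]], dtype=float)

z, a = [], [inputs]
for i in range(len(w)):
    z.append(a[-1] @ w[i] + b[i])   # (batch, prev) @ (prev, next) + (next,)
    a.append(sigmoid(z[-1]))

print(a[-1].shape)  # (4, 1): one guess per training example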
Input times weights plus bias, then apply the activation function. No operation needs a meaningless transpose. On top of that, if you ever swap the sigmoid for any other activation, you will end up right back where you are now, compensating for an incorrect sigmoid derivative.
@staticmethod
def dsigmoid(x):
    return NeuralNetwork.sigmoid(x) * (1. - NeuralNetwork.sigmoid(x))
is the correct derivative. In backpropagation, when you compute da[i]/dz[i] (which in your case happens to be dsigmoid), the argument is not the activation.
z[i] = a[i] * w[i] + b[i]
a[i+1] = f(z[i])
The partial derivative of a[i+1] with respect to z[i] is definitely not f'(a[i+1]) but f'(z[i]). That will save you a lot of trouble later. To be honest, I cannot quite follow your backpropagation algorithm.
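To make that distinction concrete, here is a small illustrative sketch (not from the answer itself) showing that the question's dsigmoid(x) = x * (1 - x) only matches the true derivative sigmoid'(z) when it is fed the activation a = sigmoid(z), not z itself:

import numpy as np

def sigmoid(z):
    return 1. / (1. + np.exp(-z))

def dsigmoid_from_z(z):
    # Mathematically correct derivative: takes the pre-activation z
    return sigmoid(z) * (1. - sigmoid(z))

def dsigmoid_from_activation(a):
    # The question's form: only valid if a is already sigmoid(z)
    return a * (1. - a)

z = np.linspace(-3., 3., 7)
a = sigmoid(z)

print(np.allclose(dsigmoid_from_z(z), dsigmoid_from_activation(a)))  # True
print(np.allclose(dsigmoid_from_z(z), dsigmoid_from_activation(z)))  # False: passing z here is wrong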
When you get the network's output, call it "guess". You compute the error as:
cost = .5 * (guess - target) ** 2
at least for a single training example, though it looks like you are doing SGD. Now, the first step is to compute the delta as:
delta = (guess - target)
and then you walk backwards through the layers; on each iteration you compute dw[i], db[i], and the new delta for the upcoming (previous) layer:
delta = delta * dsigmoid(z[i])
dw[i] = a[i-1].T @ delta
db[i] = np.sum(delta, axis=0)
delta = delta @ w[i].T
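Put together, one full training step under these conventions might look like the following sketch (a minimal illustration with assumed names such as train_step; the indexing is chosen so the shapes line up, and it is not the answer's verbatim code):

import numpy as np

def sigmoid(z):
    return 1. / (1. + np.exp(-z))

def dsigmoid(z):
    s = sigmoid(z)
    return s * (1. - s)

def train_step(w, b, inputs, targets, lr):
    # Forward pass: a[0] is the input, a[i+1] = sigmoid(a[i] @ w[i] + b[i])
    z, a = [], [inputs]
    for i in range(len(w)):
        z.append(a[-1] @ w[i] + b[i])
        a.append(sigmoid(z[-1]))

    # Backward pass: start from dC/d(guess) for cost = .5 * (guess - target) ** 2
    delta = a[-1] - targets
    for i in reversed(range(len(w))):
        delta = delta * dsigmoid(z[i])   # dC/dz[i]
        dw = a[i].T @ delta              # dC/dw[i]
        db = np.sum(delta, axis=0)       # dC/db[i]
        delta = delta @ w[i].T           # dC/da[i], passed on to the previous layer
        w[i] -= lr * dw
        b[i] -= lr * db
    return a[-1]

# XOR-style data from the question; whether it fully separates depends on the random init
X = np.array([[1, 0], [0, 1], [0, 0], [1, 1]], dtype=float)
Y = np.array([[1], [1], [0], [0]], dtype=float)

w = [np.random.normal(0., 1., size=(2, 2)), np.random.normal(0., 1., size=(2, 1))]
b = [np.zeros(2), np.zeros(1)]

for _ in range(50000):
    guess = train_step(w, b, X, Y, lr=0.5)
print(np.round(guess, 3))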
That's basically it. Beyond that, consider reading up on activation functions. The sigmoid is an activation function that suffers particularly from the vanishing gradient problem. That means the sigmoid's derivative can get so small that your weights barely change at all. Take a look at ReLU or Leaky-ReLU; there are plenty of variants, and there is also Google's newer Swish activation function. For conv layers I got pretty decent results on CIFAR-10 using Swish.
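For reference, here is a small sketch of those alternative activations and their derivatives (plain NumPy definitions, with beta = 1 assumed for Swish; these are illustrations, not code from the answer):

import numpy as np

def relu(z):
    return np.maximum(0., z)

def drelu(z):
    return (z > 0.).astype(float)

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0., z, alpha * z)

def dleaky_relu(z, alpha=0.01):
    return np.where(z > 0., 1., alpha)

def swish(z):
    # Swish with beta = 1: z * sigmoid(z)
    return z / (1. + np.exp(-z))

def dswish(z):
    s = 1. / (1. + np.exp(-z))
    return s + z * s * (1. - s)

z = np.linspace(-2., 2., 5)
print(relu(z), leaky_relu(z), swish(z), sep="\n")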