I have implemented the following neural network to solve the XOR problem in Python. My network consists of an input layer with 3 neurons, a hidden layer with 2 neurons, and an output layer with 1 neuron. I use the Sigmoid function as the activation for both the hidden layer and the output layer:
import numpy as np
x = np.array([[0,0,1], [0,1,1],[1,0,1],[1,1,1]])
y = np.array([[0,1,1,0]]).T
np.random.seed(1)
weights1 = np.random.random((3,2)) - 1
weights2 = np.random.random((2,1)) - 1
def nonlin(x, deriv=False):
    if deriv:
        return x*(1-x)
    return 1/(1+np.exp(-x))

for iter in xrange(10000):
    z2 = np.dot(x, weights1)
    a2 = nonlin(z2)
    z3 = np.dot(a2, weights2)
    a3 = nonlin(z3)
    error = y - a3
    delta3 = error * nonlin(z3, deriv=True)
    l1error = delta3.dot(weights2.T)
    delta2 = l1error * nonlin(z2, deriv=True)
    weights2 += np.dot(a2.T, delta3)
    weights1 += np.dot(x.T, delta2)
print(a3)
The backpropagation seems correct, but I keep getting this error, and all the values become 'nan'. OUTPUT:
RuntimeWarning: overflow encountered in exp
return 1/(1+np.exp(-x))
RuntimeWarning: overflow encountered in multiply
return x*(1-x)
[[ nan]
[ nan]
[ nan]
[ nan]]
Can you help me solve this problem? Thank you.
Answer 0 (score: 0)
There is a problem with exploding weights:
weight1 = [[ -6.25293101e+194 -2.22527234e+000]
[ 2.24755436e+193 -2.44789058e+000]
[ -2.40600808e+194 -1.62490517e+000]]
This is because when you compute the delta errors for backpropagation, you pass in the output of the dot product instead of the output of the activation function. The deriv branch of nonlin computes x*(1-x), which equals the sigmoid derivative only when x is already the sigmoid's output; feeding it the pre-activations z2 and z3 produces wrong, unbounded values, so the weights explode and the sigmoid overflows to nan.
Correct your code as follows:
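A quick numeric check of this point (a standalone sketch, not from the question's code): the identity sigmoid'(z) = s(z)*(1 - s(z)) holds in terms of the sigmoid's output s(z), so applying x*(1-x) to a raw pre-activation gives nonsense:

```python
import numpy as np

def nonlin(x, deriv=False):
    if deriv:
        return x*(1-x)          # sigmoid derivative, in terms of the OUTPUT
    return 1/(1+np.exp(-x))

z = 5.0
a = nonlin(z)                    # sigmoid(5), about 0.9933
true_deriv = a * (1 - a)         # correct derivative, always in (0, 0.25]
wrong = nonlin(z, deriv=True)    # z*(1-z) = 5*(1-5) = -20.0, nonsense
print(true_deriv, wrong)
```

A negative "derivative" of magnitude 20 pushes the weight updates in the wrong direction and at a huge scale, which is exactly the exploding-weight behavior shown above.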
delta3 = error * nonlin(a3, deriv=True)
l1error = delta3.dot(weights2.T)
delta2 = l1error * nonlin(a2, deriv=True)