I am trying to implement a 2-layer neural network from scratch, but something is wrong: after a few iterations my loss becomes nan.
'''
We are implementing a two-layer neural network.
'''
import numpy as np
x, y = np.random.rand(64, 1000), np.random.randn(64, 10)
w1, w2 = np.random.rand(1000, 100), np.random.rand(100, 10)
learning_rate = 1e-4
x -= np.mean(x, axis=0)  # Zero-centering the training data
for t in range(2000):
    h = np.maximum(0, x.dot(w1))  # Applying ReLU non-linearity
    ypred = h.dot(w2)  # Output layer scores
    loss = np.square(ypred - y).sum()
    print('Step', t, '\tLoss:- ', loss)
    # Gradient descent
    grad_ypred = 2.0 * (ypred - y)
    gradw2 = (h.transpose()).dot(grad_ypred)
    grad_h = grad_ypred.dot(w2.transpose())
    gradw1 = (x.transpose()).dot(grad_h*h*(1-h))
    w1 -= learning_rate*gradw1
    w2 -= learning_rate*gradw2
I also implemented linear classifiers using a Softmax loss and a multiclass SVM loss, and the same problem occurs there. Please tell me how to fix this.
Output:
D:\Study Material\Python 3 Tutorial\PythonScripts\Machine Learning>python TwoLayerNeuralNet.py
Step 0 Loss:- 19436393.79233052
Step 1 Loss:- 236820315509427.38
Step 2 Loss:- 1.3887002186558748e+47
Step 3 Loss:- 1.868219503527502e+189
Step 4 Loss:- inf
TwoLayerNeuralNet.py:23: RuntimeWarning: invalid value encountered in multiply
gradw1 = (x.transpose()).dot(grad_h*h*(1-h))
TwoLayerNeuralNet.py:12: RuntimeWarning: invalid value encountered in maximum
h = np.maximum(0,x.dot(w1)) # Applying Relu Non linearity
Step 5 Loss:- nan
Step 6 Loss:- nan
Step 7 Loss:- nan
Step 8 Loss:- nan
Step 9 Loss:- nan
Step 10 Loss:- nan
Step 11 Loss:- nan
Step 12 Loss:- nan
Step 13 Loss:- nan
Step 14 Loss:- nan
Step 15 Loss:- nan
Step 16 Loss:- nan
Step 17 Loss:- nan
Step 18 Loss:- nan
Step 19 Loss:- nan
Step 20 Loss:- nan
Answer 0 (score: 0)
That happens because your loss is far too large. Try this:
loss = np.square(ypred - y).mean()
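Note that if you switch the loss from sum() to mean(), the gradient should be scaled the same way, otherwise the printed loss no longer corresponds to the update you apply. A one-line sketch of the matching gradient, dividing by the number of output entries:

grad_ypred = 2.0 * (ypred - y) / y.size  # gradient of the mean squared error, not the sum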
If that still doesn't work, try reducing the learning rate to 1e-8 and watch whether the loss goes up or down. If the loss is decreasing, good; if it keeps increasing, that is a bad sign, and you may need to check your weight updates and consider using a better dataset.
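There is also a mismatch in the question's backward pass worth noting: the forward pass uses ReLU, but gradw1 multiplies by h*(1-h), which is the sigmoid derivative. The ReLU derivative simply zeroes the gradient wherever the activation was clipped to 0. Below is a minimal corrected sketch of the whole loop, with the mean loss, the matching gradient, the ReLU backward pass, and (my choice, not part of the original code) He-scaled weight initialization instead of np.random.rand:

import numpy as np

x, y = np.random.rand(64, 1000), np.random.randn(64, 10)
x -= np.mean(x, axis=0)                                # zero-center the inputs
w1 = np.random.randn(1000, 100) * np.sqrt(2.0 / 1000)  # He initialization for ReLU layers
w2 = np.random.randn(100, 10) * np.sqrt(2.0 / 100)
learning_rate = 1e-2

for t in range(2000):
    h = np.maximum(0, x.dot(w1))             # ReLU hidden layer
    ypred = h.dot(w2)                        # output layer scores
    loss = np.square(ypred - y).mean()
    print('Step', t, '\tLoss:- ', loss)
    # Backpropagation
    grad_ypred = 2.0 * (ypred - y) / y.size  # gradient of the *mean* squared error
    gradw2 = h.T.dot(grad_ypred)
    grad_h = grad_ypred.dot(w2.T)
    grad_h[h <= 0] = 0                       # ReLU derivative: gradient flows only where h > 0
    gradw1 = x.T.dot(grad_h)
    w1 -= learning_rate * gradw1
    w2 -= learning_rate * gradw2

With these changes the loss should decrease smoothly instead of overflowing to inf and then nan.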