I am new to machine learning and trying to understand it on my own. So I picked up a book (in case anyone is interested: https://www.amazon.com/Neural-Networks-Unity-Programming-Windows/dp/1484236726) and started reading the first chapter. While reading, there were things I did not understand, so I went online to research them. However, even after all that reading and research, there are still a few points I cannot grasp:
Complete neural network code:
import numpy as np

# activation function (sigmoid, maps values to the range 0..1)
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# derivative of the sigmoid, expressed in terms of the sigmoid output x
def derivative(x):
    return x * (1 - x)

# initialize input (4 training examples (rows), 3 features (cols))
X = np.array([[0,0,1],[0,1,1],[1,0,1],[1,1,1]])

# initialize output for training data (4 training examples (rows), 1 output each (col))
Y = np.array([[0],[1],[1],[0]])

np.random.seed(1)

# synapses (weight matrices), initialized randomly in [-1, 1)
syn0 = 2 * np.random.random((3,4)) - 1
syn1 = 2 * np.random.random((4,1)) - 1

for iter in range(60000):
    # layers (forward pass)
    l0 = X
    l1 = sigmoid(np.dot(l0, syn0))
    l2 = sigmoid(np.dot(l1, syn1))

    # error
    l2_error = Y - l2
    if iter % 10000 == 0:  # only print error every 10000 steps to save time and limit the amount of output
        print("Error L2: " + str(np.mean(np.abs(l2_error))))

    # what is this part doing?
    l2_delta = l2_error * derivative(l2)
    l1_error = l2_delta.dot(syn1.T)
    l1_delta = l1_error * derivative(l1)
    if iter % 10000 == 0:  # only print error every 10000 steps to save time and limit the amount of output
        print("Error L1: " + str(np.mean(np.abs(l1_error))))

    # update weights
    syn1 = syn1 + l1.T.dot(l2_delta)  # derivative with respect to the cost function
    syn0 = syn0 + l0.T.dot(l1_delta)

print(l2)
Thanks!
Answer 0 (score: 0):
Generally, the layer-by-layer computation (hence the notation l1 and l2 above) just takes the dot product of a vector $x \in \mathbb{R}^n$ with a weight vector of the same dimension and then applies the sigmoid function to each component.
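For instance, here is a minimal sketch of what a single layer computes, using an illustrative 3-feature input and a 3x4 weight matrix analogous to syn0 above (the names x and W are just stand-ins, not from the book):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

np.random.seed(1)
x = np.array([0, 1, 1])               # one training example with 3 features
W = 2 * np.random.random((3, 4)) - 1  # weight matrix analogous to syn0

z = np.dot(x, W)   # one dot product per hidden unit
a = sigmoid(z)     # sigmoid applied to each component
print(a)           # hidden-layer activations, shape (4,)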
Gradient descent --- imagine the graph of, say, $f(x) = x^2$ in two dimensions, and suppose we do not know how to find its minimum directly. Gradient descent essentially evaluates $f'(x)$ at successive points and checks whether $f'(x)$ is close to zero.
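As a rough sketch of that idea in code (the starting point, step size, and stopping threshold below are arbitrary choices for illustration only):

# gradient descent on f(x) = x^2, whose derivative is f'(x) = 2x
x = 5.0              # arbitrary starting point
learning_rate = 0.1  # step size
for step in range(1000):
    grad = 2 * x              # f'(x) at the current point
    if abs(grad) < 1e-6:      # stop once the derivative is close to zero
        break
    x = x - learning_rate * grad
print(x)  # ends up very close to 0, the minimizer of x^2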