我正在尝试用python编写一个非常基本的神经网络,其中3个输入节点的值为0或1,而单个输出节点的值为0或1。输出应该几乎等于第二个输入,但是经过训练后,权重太高了,网络几乎总是猜测为1。
我正在将python 3.7与numpy和scipy一起使用。我尝试过更改训练集,新实例和随机种子
import numpy as np
from scipy.special import expit as ex
rand.seed(10)
training_set=[[0,1,0],[1,0,1],[0,0,0],[1,1,1]] #The training sets and their outputs
training_outputs=[0,1,0,1]
weightlst=[rand.uniform(-1,1),rand.uniform(-1,1),rand.uniform(-1,1)] #Weights are randomly set with a value between -1 and 1
print('Random weights\n'+str(weightlst))
def calcout(inputs,weights): #Calculate the expected output with given inputs and weights
output=0.5
for i in range(len(inputs)):
output=output+(inputs[i]*weights[i])
#print('\nmy output is ' + str(ex(output)))
return ex(output) #Return the output on a sigmoid curve between 0 and 1
def adj(expected_output,training_output,weights,inputs): #Adjust the weights based on the expected output, true (training) output and the weights
adjweights=[]
error=expected_output-training_output
for i in weights:
adjweights.append(i+(error*(expected_output*(1-expected_output))))
return adjweights
#Train the network, adjusting weights each time
training_iterations=10000
for k in range(training_iterations):
for l in range(len(training_set)):
expected=calcout(training_set[l],weightlst)
weightlst=adj(expected,training_outputs[l],weightlst,training_set[l])
new_instance=[1,0,0] #Calculate and return the expected output of a new instance
print('Adjusted weights\n'+str(weightlst))
print('\nExpected output of new instance = ' + str(calcout(new_instance,weightlst)))
预期输出将为0或非常接近它的值,但是无论我将new_instance设置为什么,输出仍为
Random weights
[-0.7312715117751976, 0.6948674738744653, 0.5275492379532281]
Adjusted weights
[1999.6135460307303, 2001.03968501638, 2000.8723667804588]
Expected output of new instance = 1.0
我的代码有什么问题?
答案 0 :(得分:4)
错误:
w_i = w_i + learning_rate * delta_w_i
的权重更新规则,(delta_w_i是相对于w_i的损耗梯度)delta_w_i = error*sample[i]
(输入矢量样本的第i个值)AND
,OR
之类的函数生成的数据。请注意,布尔值XOR
不可线性分离。import numpy as np
from scipy.special import expit as ex
rand.seed(10)
training_set=[[0,1,0],[1,0,1],[0,0,0],[1,1,1]] #The training sets and their outputs
training_outputs=[1,1,0,1] # Boolean OR of input vector
#training_outputs=[0,0,,1] # Boolean AND of input vector
weightlst=[rand.uniform(-1,1),rand.uniform(-1,1),rand.uniform(-1,1)] #Weights are randomly set with a value between -1 and 1
bias = rand.uniform(-1,1)
print('Random weights\n'+str(weightlst))
def calcout(inputs,weights, bias): #Calculate the expected output with given inputs and weights
output=bias
for i in range(len(inputs)):
output=output+(inputs[i]*weights[i])
#print('\nmy output is ' + str(ex(output)))
return ex(output) #Return the output on a sigmoid curve between 0 and 1
def adj(expected_output,training_output,weights,bias,inputs): #Adjust the weights based on the expected output, true (training) output and the weights
adjweights=[]
error=training_output-expected_output
lr = 0.1
for j, i in enumerate(weights):
adjweights.append(i+error*inputs[j]*lr)
adjbias = bias+error*lr
return adjweights, adjbias
#Train the network, adjusting weights each time
training_iterations=10000
for k in range(training_iterations):
for l in range(len(training_set)):
expected=calcout(training_set[l],weightlst, bias)
weightlst, bias =adj(expected,training_outputs[l],weightlst,bias,training_set[l])
new_instance=[1,0,0] #Calculate and return the expected output of a new instance
print('Adjusted weights\n'+str(weightlst))
print('\nExpected output of new instance = ' + str(calcout(new_instance,weightlst, bias)))
输出:
Random weights
[0.142805189379827, -0.14222189064977075, 0.15618260226894076]
Adjusted weights
[6.196759842119063, 11.71208191137411, 6.210137255008176]
Expected output of new instance = 0.6655563851223694
从上可以看到,对于输入[1,0,0]
,该模型预测了概率0.66
为1类(因为0.66> 0.5)。这是正确的,因为输出类是输入向量的OR。
为了学习/理解每个权重的更新方式,可以像上面这样编码,但是实际上所有操作都是矢量化的。检查link的向量化实现。