ReLU function: in this case, using a leaky ReLU gives me wrong outputs, but using the sigmoid function gives me acceptable outputs.
Here is the code I have:
import numpy as np

def relu(x):
    return np.maximum(0.01 * x, x)

def relu_derivative(x):
    x[x > 0] = 1
    x[x < 0] = 0.01
    return x

training_inputs = np.array([[1, 0],
                            [1, 1],
                            [0, 0],
                            [0, 1]])
training_outputs = np.array([[1, 0, 0, 1]]).T

weights = 2 * np.random.random((2, 1)) - 1
print('Weights before training: ')
print(weights)

for epochs in range(10000):
    outputs = relu(np.dot(training_inputs, weights))
    error = training_outputs - outputs
    adjustment = error * relu_derivative(outputs)
    weights += np.dot(training_inputs.T, adjustment)

print('Neuron Weights after training: ')
print(weights)
print('Outputs after training: ')
print(outputs)
Epochs = 10000
Outputs using the ReLU function:
[0.01],[0.01],[0.01],[0.01],[0.01]
Outputs using the sigmoid function:
The sigmoid function gives much better outputs than ReLU. I tested up to 100000 epochs and the results with the ReLU function were still the same. Is there a problem with my function or my code?
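For reference, the sigmoid code being compared against is not shown above; a minimal sketch of what that variant presumably looks like (an assumption: the same loop with sigmoid and its derivative swapped in) is:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(s):
    # derivative expressed in terms of the sigmoid output s = sigmoid(x)
    return s * (1 - s)

training_inputs = np.array([[1, 0], [1, 1], [0, 0], [0, 1]])
training_outputs = np.array([[1, 0, 0, 1]]).T
weights = 2 * np.random.random((2, 1)) - 1

for epochs in range(10000):
    outputs = sigmoid(np.dot(training_inputs, weights))
    error = training_outputs - outputs
    adjustment = error * sigmoid_derivative(outputs)
    weights += np.dot(training_inputs.T, adjustment)

print(outputs)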
Answer 0 (score: 1)
First, there is a small bug in your relu_derivative function: you should not modify x in place, but create a new array instead:
def relu_derivative(x):
    y = np.zeros_like(x)
    y[x > 0] = 1
    y[x < 0] = 0.01
    return y
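To make the in-place problem concrete: in your training loop, relu_derivative(outputs) overwrites the entries of outputs with 1 / 0.01 / 0, so the final print(outputs) shows that mutated mask rather than the network's predictions (which may be why the printed ReLU outputs above are all 0.01). A small sketch of the mutation, with your original function renamed relu_derivative_inplace for clarity:

import numpy as np

def relu_derivative_inplace(x):
    x[x > 0] = 1
    x[x < 0] = 0.01
    return x

outputs = np.array([[0.7], [-0.2], [0.0], [1.3]])
relu_derivative_inplace(outputs)
print(outputs)  # now [[1.], [0.01], [0.], [1.]] -- the original activations are gone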
However, this does not answer your question, because the network still cannot learn to solve XOR correctly. I don't think one hidden unit is enough to solve it with ReLU.
I rewrote the same experiment in PyTorch; here is the code:
import torch

class Model(torch.nn.Module):
    def __init__(self, *args, **kwargs):
        super().__init__()
        self.hidden = torch.nn.Linear(2, kwargs['h'])
        self.relu = torch.nn.LeakyReLU(0.1)
        self.out = torch.nn.Linear(kwargs['h'], 1)
        with torch.no_grad():
            self.hidden.bias.zero_()
            self.out.bias.zero_()

    def forward(self, x):
        z = self.hidden(x)
        z = self.relu(z)
        z = self.out(z)
        return z

if __name__ == '__main__':
    training_inputs = torch.Tensor([[1., 0.],
                                    [1., 1.],
                                    [0., 0.],
                                    [0., 1.]])
    training_outputs = torch.Tensor([1., 0., 0., 1.]).reshape(4, 1)

    model = Model(h=2)
    learning_rate = 0.01
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

    print(*[i for i in model.parameters()], sep='\n')
    for _ in range(1000):
        pred = model(training_inputs)
        loss = criterion(pred, training_outputs)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(pred, loss)
    print(*[i for i in model.parameters()], sep='\n')
And indeed, with only one hidden unit it does not seem possible to solve XOR, but with two hidden units it sometimes works (depending on the initialization).
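If you want to check that dependence on the initialization yourself, one option (a sketch, not part of the answer above; train_once, the 0.5 decision threshold, and the 20 seeds are my own choices) is to rerun the training for several fixed seeds and count the successes, reusing the Model class defined above:

import torch

# Assumes the Model class from the PyTorch snippet above is defined in this file.
def train_once(seed, hidden=2, steps=1000):
    torch.manual_seed(seed)  # fix the random initialization for this run
    X = torch.Tensor([[1., 0.], [1., 1.], [0., 0.], [0., 1.]])
    y = torch.Tensor([1., 0., 0., 1.]).reshape(4, 1)
    model = Model(h=hidden)
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    for _ in range(steps):
        loss = criterion(model(X), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # count the run as a success if every prediction lands on the right side of 0.5
    return bool(((model(X) > 0.5).float() == y).all())

successes = sum(train_once(seed) for seed in range(20))
print(f'{successes}/20 seeds solved XOR with 2 hidden units')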