Why can't I learn the XOR function with this network and these constraints?

Time: 2019-03-06 09:52:47

Tags: python machine-learning neural-network deep-learning pytorch

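Suppose I have the following constraints and network:

  1. The architecture is fixed (see this image) (note that there are no biases)
  2. The activation function of the hidden layer is ReLU
  3. The output layer has no activation function (it should simply return the sum of the inputs it receives).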
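
One consequence of these three constraints may be worth keeping in mind: a bias-free network with ReLU hidden units and a linear output is positively homogeneous, meaning f(c·x) = c·f(x) for every c ≥ 0, and in particular f(0, 0) = 0. The plain-Python sketch below checks this numerically. It assumes a 2-2-1 layout (two inputs, two hidden units, one output), and the helper names and weight values are hypothetical, chosen only for illustration.

```python
def relu(v):
    """ReLU activation for a single number."""
    return max(0.0, v)

def forward(x, w_hidden, w_out):
    """Bias-free 2-2-1 network: linear -> ReLU -> linear, no output activation."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))

# Arbitrary illustrative weights (hypothetical, not learned values).
w_hidden = [[0.5, -1.0],
            [-0.25, 2.0]]
w_out = [1.0, -0.5]

x = (1.0, 2.0)
base = forward(x, w_hidden, w_out)

# Scaling the input by c >= 0 scales the output by the same c.
scales = (0.5, 2.0, 4.0)
scaled = [forward((c * x[0], c * x[1]), w_hidden, w_out) for c in scales]

for c, out in zip(scales, scaled):
    print(c, out, c * base)

# The origin always maps to zero because there is no bias term.
origin = forward((0.0, 0.0), w_hidden, w_out)
print(origin)
```

If this holds, any target that does not scale with the input (for example, a label that depends only on the signs of the inputs) cannot be reproduced exactly by such a network, whatever the weights.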

I tried to implement this in PyTorch using various initialization schemes and different data sets, but I failed (the code is at the bottom).

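My questions are:

  1. Is there anything wrong with my NN training procedure?
  2. Is this a feasible problem? If so, how?
  3. If it is feasible, can we still achieve it while constraining the weights to the set {-1, 0, 1}?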
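
As a hedged illustration for questions 2 and 3, the sketch below evaluates one hand-picked weight setting on the four binary XOR inputs. It assumes the 2-2-1 bias-free layout (the image itself is not reproduced here), and the weights are chosen by hand rather than learned: the hidden units compute relu(x1 - x2) and relu(x2 - x1), so the output is |x1 - x2|. All names are hypothetical.

```python
def relu(v):
    """ReLU activation for a single number."""
    return max(0.0, v)

def forward(x, w_hidden, w_out):
    """Bias-free 2-2-1 network: linear -> ReLU -> linear, no output activation."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))

# Hand-picked weights, all drawn from {-1, 0, 1} (an assumption for illustration).
W_HIDDEN = [[1.0, -1.0],   # h1 = relu(x1 - x2)
            [-1.0, 1.0]]   # h2 = relu(x2 - x1)
W_OUT = [1.0, 1.0]         # output = h1 + h2 = |x1 - x2|

XOR_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
XOR_TARGETS = [0, 1, 1, 0]

predictions = [forward(x, W_HIDDEN, W_OUT) for x in XOR_INPUTS]
for x, pred, target in zip(XOR_INPUTS, predictions, XOR_TARGETS):
    print(x, pred, target)
```

Under these assumptions the four binary cases are matched exactly. This says nothing about whether gradient descent will find such weights from a random initialization.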

Code:

    import torch
    import torch.nn as nn
    import torch.optim as optim
    import torch.utils.data as data_utils
    import numpy as np

    class Network(nn.Module):
        def __init__(self):
            super(Network, self).__init__()
            self.fc1 = nn.Linear(2, 2, bias=False)
            self.fc2 = nn.Linear(2, 1, bias=False)
            self.rl = nn.ReLU()

        def forward(self, x):
            x = self.fc1(x)
            x = self.rl(x)
            x = self.fc2(x)
            return x

    # create an XOR data set to train
    rng = np.random.RandomState(0)
    X = rng.randn(200, 2)
    y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0).astype('int32')

    # test data set
    X_test = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

    train = data_utils.TensorDataset(torch.from_numpy(X).float(),
                                     torch.from_numpy(y).float())
    train_loader = data_utils.DataLoader(train, batch_size=50, shuffle=True)
    test = torch.from_numpy(X_test).float()

    # training the network
    num_epoch = 10000
    net = Network()
    net.fc1.weight.data.clamp_(min=-1, max=1)
    net.fc2.weight.data.clamp_(min=-1, max=1)

    # define loss and optimizer
    criterion = nn.MSELoss()
    optimizer = optim.Adam(net.parameters())

    for epoch in range(num_epoch):
        running_loss = 0  # loss per epoch
        for (X, y) in train_loader:
            # make the grads zero
            optimizer.zero_grad()
            # forward propagate
            out = net(X)
            # calculate loss and update
            loss = criterion(out, y)
            loss.backward()
            optimizer.step()
            running_loss += loss.data
        if epoch % 500 == 0:
            print("Epoch: {0} Loss: {1}".format(epoch, running_loss))

The loss does not improve. After a few epochs it gets stuck at some value (I'm not sure how to make this reproducible, because I get a different value every time).

The predictions the network returns are nowhere near the XOR output.
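
One detail in the code above that may be worth checking: `net(X)` has shape `(batch, 1)` while `y` has shape `(batch,)`. `nn.MSELoss` broadcasts such a pair (PyTorch emits a warning about the differing sizes), so the loss is averaged over every prediction/target combination rather than over matching pairs. The plain-Python sketch below imitates that arithmetic on four points. The function names are hypothetical and it is only meant to show the effect, not to reproduce PyTorch internals.

```python
def mse_elementwise(pred, target):
    """Intended loss: compare prediction i with target i only."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def mse_broadcast(pred, target):
    """What an (n, 1) vs (n,) broadcast computes: all n * n combinations."""
    n = len(pred)
    return sum((p - t) ** 2 for p in pred for t in target) / (n * n)

target = [0.0, 1.0, 1.0, 0.0]

perfect = list(target)    # predictions that match the targets exactly
constant = [0.5] * 4      # predictions stuck at the target mean

perfect_elementwise = mse_elementwise(perfect, target)
perfect_broadcast = mse_broadcast(perfect, target)
constant_broadcast = mse_broadcast(constant, target)

print(perfect_elementwise)   # 0.0
print(perfect_broadcast)     # 0.5
print(constant_broadcast)    # 0.25
```

In this toy case the broadcast loss ranks a constant prediction above a perfect one, which would be consistent with a loss that plateaus. A possible adjustment, assuming this is indeed the cause, is to align the shapes before computing the loss, for example `criterion(out.squeeze(1), y)` or `criterion(out, y.unsqueeze(1))`.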

0 answers:

No answers