Cannot backpropagate with .backward() in a PyTorch module; weights are not updating

Asked: 2019-07-13 13:25:48

Tags: python pytorch

I want to create a network that estimates the next sample of a signal. I generated a simple sine signal, but when I run the code, I get noise as output. I then checked the layer weights and found that they are not updating. I cannot find the error here.

class Model(nn.Module):
    def __init__(self, in_dim, hidden_dim, num_classes):
        super(Model, self).__init__()
        self.layer1 = nn.Linear(in_dim, hidden_dim)
        self.layer2 = nn.Linear(hidden_dim, hidden_dim)
        self.layer3 = nn.Linear(hidden_dim, num_classes)
        self.relu = nn.ReLU()

    def forward(self, x):
        a = self.relu(self.layer1(x))
        a = self.relu(self.layer2(a))
        return self.relu(self.layer3(a))

Training:

def train(epoch, L, depth):
    criteria = nn.MSELoss()
    learning_rate = 1e-3
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

    t = np.linspace(0, 2, L + 2)
    fs = L + 2

    trn_loss = list()

    for f in range(0, epoch):
        phase = f / np.pi
        x = np.sin(2 * np.pi * t * fs + phase)
        x = torch.from_numpy(x).detach().float()

        optimizer.zero_grad()

        x_hat = model(x[:-2])

        currentCost = criteria(x_hat, x[-2])
        trn_loss.append(currentCost.item())
        print(model.layer1.weight.data.clone())
        currentCost.backward()
        optimizer.step()
        print(model.layer1.weight.data.clone())
        sys.exit('DEBUG')

Output:

tensor([[-0.1715, -0.1696,  0.0424,  ...,  0.0154,  0.1450, -0.0544],
    [ 0.0368,  0.1427, -0.1419,  ...,  0.0966,  0.0298, -0.0659],
    [-0.1641, -0.1551,  0.0570,  ..., -0.0227, -0.1426, -0.0648],
    ...,
    [-0.0684, -0.1707, -0.0711,  ...,  0.0788,  0.1386,  0.1546],
    [ 0.1401, -0.0922, -0.0104,  ..., -0.0490,  0.0404,  0.1038],
    [-0.0604, -0.0517,  0.0715,  ..., -0.1200,  0.0014,  0.0215]])
tensor([[-0.1715, -0.1696,  0.0424,  ...,  0.0154,  0.1450, -0.0544],
    [ 0.0368,  0.1427, -0.1419,  ...,  0.0966,  0.0298, -0.0659],
    [-0.1641, -0.1551,  0.0570,  ..., -0.0227, -0.1426, -0.0648],
    ...,
    [-0.0684, -0.1707, -0.0711,  ...,  0.0788,  0.1386,  0.1546],
    [ 0.1401, -0.0922, -0.0104,  ..., -0.0490,  0.0404,  0.1038],
    [-0.0604, -0.0517,  0.0715,  ..., -0.1200,  0.0014,  0.0215]])

1 Answer:

Answer 0 (score: 2):

You are using a ReLU activation on the last layer in your forward call. This restricts the network's output to the range [0, +inf).

Note that your target lies in the range [-1, 1], so the network cannot output half of the values (the negative ones) at all, and for the positive ones it has to squash them into the [0, +inf) space.
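As a quick check (a minimal sketch, not the asker's code), you can see how ReLU clips the negative half of a sine target, so an MSE loss against those samples can never reach zero:

```python
import math
import torch

# ReLU zeroes out the negative half of a sine target.
t = torch.linspace(0, 2 * math.pi, 8)
target = torch.sin(t)
clipped = torch.relu(target)
print(target.min().item())   # negative
print(clipped.min().item())  # clipped at 0.0
```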

You should change return self.relu(self.layer3(a)) to return self.layer3(a) in forward.

Better yet, to help your network fit the [-1, 1] range of the target, use an activation on the last layer as well; torch.tanh should work best here, i.e. return torch.tanh(self.layer3(a)).
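A minimal sketch of the proposed fix (the dimensions and the target value below are chosen arbitrarily for illustration, not taken from the question): with the final ReLU replaced by torch.tanh, the output covers [-1, 1], the gradient flows, and the weights change after one optimizer step:

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    """Same architecture as in the question, but the final ReLU is
    replaced by tanh so the output can cover the target's [-1, 1] range."""
    def __init__(self, in_dim, hidden_dim, num_classes):
        super(Model, self).__init__()
        self.layer1 = nn.Linear(in_dim, hidden_dim)
        self.layer2 = nn.Linear(hidden_dim, hidden_dim)
        self.layer3 = nn.Linear(hidden_dim, num_classes)
        self.relu = nn.ReLU()

    def forward(self, x):
        a = self.relu(self.layer1(x))
        a = self.relu(self.layer2(a))
        return torch.tanh(self.layer3(a))  # was: self.relu(self.layer3(a))

# Illustrative dimensions, not the question's.
model = Model(in_dim=10, hidden_dim=32, num_classes=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

x = torch.randn(10)
target = torch.tensor([-0.5])  # a negative target is now reachable

before = model.layer1.weight.detach().clone()
loss = criterion(model(x), target)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(torch.equal(before, model.layer1.weight))  # False: weights updated
```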