How to build an RNN without using nn.RNN

Date: 2018-04-23 18:27:26

Tags: python-3.x deep-learning recurrent-neural-network pytorch rnn

I need to build an RNN (without using nn.RNN) to the following specification (see my sketch of the cell after the list):

  1. It should have the following structure and set of weights:

    • It is a character-level RNN.

    • It should have 1 hidden layer.

    • Wxh (from the input layer to the hidden layer)

    • Whh (the recurrent connection within the hidden layer)

    • Who (from the hidden layer to the output layer)

    • I need to use Tanh for the hidden layer.

    • I need to use softmax for the output layer.
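
    In formulas, I read this as h_t = tanh(Wxh·x_t + Whh·h_(t-1)) and y_t = softmax(Who·h_t). A minimal sketch of such a cell, with my own (hypothetical) naming, would be:

    import torch

    class VanillaRNNCell(torch.nn.Module):

        def __init__(self, input_size, hidden_size, output_size):
            super(VanillaRNNCell, self).__init__()
            self.Wxh = torch.nn.Linear(input_size, hidden_size)    # input layer -> hidden layer
            self.Whh = torch.nn.Linear(hidden_size, hidden_size)   # recurrent hidden -> hidden
            self.Who = torch.nn.Linear(hidden_size, output_size)   # hidden layer -> output layer

        def forward(self, x_t, h_prev):
            h_t = torch.tanh(self.Wxh(x_t) + self.Whh(h_prev))         # tanh hidden layer
            y_t = torch.nn.functional.softmax(self.Who(h_t), dim=1)    # softmax output layer
            return y_t, h_t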

  2. I have implemented the code, using CrossEntropyLoss() as the loss function. It gives me this error:

    RuntimeError                              Traceback (most recent call last)
    <ipython-input-33-94b42540bc4f> in <module>()
         25         print("target ",target_tensor[timestep])
         26 
    ---> 27         loss += criterion(output,target_tensor[timestep].view(1,n_vocab))
         28 
         29     loss.backward()
    
    /opt/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
        323         for hook in self._forward_pre_hooks.values():
        324             hook(self, input)
    --> 325         result = self.forward(*input, **kwargs)
        326         for hook in self._forward_hooks.values():
        327             hook_result = hook(self, input, result)
    
    /opt/anaconda/lib/python3.6/site-packages/torch/nn/modules/loss.py in forward(self, input, target)
        145         _assert_no_grad(target)
        146         return F.nll_loss(input, target, self.weight, self.size_average,
    --> 147                           self.ignore_index, self.reduce)
        148 
        149 
    
    /opt/anaconda/lib/python3.6/site-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce)
       1047         weight = Variable(weight)
       1048     if dim == 2:
    -> 1049         return torch._C._nn.nll_loss(input, target, weight, size_average, ignore_index, reduce)
       1050     elif dim == 4:
       1051         return torch._C._nn.nll_loss2d(input, target, weight, size_average, ignore_index, reduce)
    
    RuntimeError: multi-target not supported at /opt/conda/conda-bld/pytorch_1513368888240/work/torch/lib/THNN/generic/ClassNLLCriterion.c:22
    

Here is my model code:

    import torch

    class CharRNN(torch.nn.Module):
    
        def __init__(self,input_size,hidden_size,output_size, n_layers = 1):
    
            super(CharRNN, self).__init__()
            self.input_size  = input_size
            self.hidden_size = hidden_size
            self.n_layers    = 1
    
            self.x2h_i = torch.nn.Linear(input_size + hidden_size, hidden_size)
            self.x2h_f = torch.nn.Linear(input_size + hidden_size, hidden_size)
            self.x2h_o = torch.nn.Linear(input_size + hidden_size, hidden_size)
            self.x2h_q = torch.nn.Linear(input_size + hidden_size, hidden_size)
            self.h2o   = torch.nn.Linear(hidden_size, output_size)
            self.sigmoid = torch.nn.Sigmoid()
            self.softmax = torch.nn.Softmax()
            self.tanh    = torch.nn.Tanh()
    
        def forward(self, input, h_t, c_t):

            # concatenate the current input with the previous hidden state
            combined_input = torch.cat((input,h_t),1)

            # LSTM-style gates: input, forget, output, and candidate state
            i_t = self.sigmoid(self.x2h_i(combined_input))
            f_t = self.sigmoid(self.x2h_f(combined_input))
            o_t = self.sigmoid(self.x2h_o(combined_input))
            q_t = self.tanh(self.x2h_q(combined_input))

            # cell-state and hidden-state updates
            c_t_next = f_t*c_t + i_t*q_t
            h_t_next = o_t*self.tanh(c_t_next)

            output = self.softmax(h_t_next)
            return output, h_t, c_t
    
        def initHidden(self):
            return torch.autograd.Variable(torch.zeros(1, self.hidden_size))
    
        def weights_init(self,model):
    
            classname = model.__class__.__name__
            if classname.find('Linear') != -1:
                model.weight.data.normal_(0.0, 0.02)
                model.bias.data.fill_(0)
    


Here is the code that trains the model:

    import numpy as np
    import torch
    import torch.nn as nn
    input_tensor  = torch.autograd.Variable(torch.zeros(seq_length,n_vocab))
    target_tensor = torch.autograd.Variable(torch.zeros(seq_length,n_vocab))
    
    model   = CharRNN(input_size = n_vocab, hidden_size = hidden_size, output_size = output_size)
    model.apply(model.weights_init)
    
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr = learning_rate)
    
    for i in range(n_epochs):
        print("Iteration", i)
    
        start_idx    = np.random.randint(0, n_chars-seq_length-1)
        train_data   = raw_text[start_idx:start_idx + seq_length + 1]
    
        input_tensor = torch.autograd.Variable(seq2tensor(train_data[:-1],n_vocab), requires_grad = True)
        target_tensor= torch.autograd.Variable(seq2tensor(train_data[1:],n_vocab), requires_grad = False).long()
    
        loss = 0
    
        h_t = torch.autograd.Variable(torch.zeros(1,hidden_size))
        c_t = torch.autograd.Variable(torch.zeros(1,hidden_size))
    
        # unroll the sequence one timestep at a time, accumulating the loss
        for timestep in range(seq_length):
    
            output, h_t, c_t = model(input_tensor[timestep].view(1,n_vocab), h_t, c_t)
    
            loss += criterion(output,target_tensor[timestep].view(1,n_vocab))
    
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    
        x_t = input_tensor[0].view(1,n_vocab)
        h_t = torch.autograd.Variable(torch.zeros(1,hidden_size))
        c_t = torch.autograd.Variable(torch.zeros(1,hidden_size))
    
        gen_seq = []
    
        # sample 100 characters, feeding each prediction back in as the next input
        for timestep in range(100):
            output, h_t, c_t = model(x_t, h_t, c_t)
            ix = np.random.choice(range(n_vocab), p=output.data.numpy().ravel())
            x_t = torch.autograd.Variable(torch.zeros(1,n_vocab))
            x_t[0,ix] = 1
            gen_seq.append(idx2char[ix])
    
        txt = ''.join(gen_seq)
        print ('----------------------')
        print (txt)
        print ('----------------------')
    

Can you help me with this?

Thanks in advance.

1 Answer:

Answer 0 (score: 1)

The problem is with your target tensor. It has shape (1, n_classes), i.e. it is a 2D tensor, but CrossEntropyLoss expects a 1D tensor.

Put differently, you are supplying a one-hot encoded target tensor, but the loss function expects class indices from 0 to n_classes-1. Change the loss computation to:

one_hot_target = target_tensor[timestep].view(1,n_vocab)
_, class_target = torch.max(one_hot_target, dim=1)
loss += criterion(output, class_target)
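
For illustration, here is a self-contained snippet (with made-up sizes, not taken from the question) showing the mismatch and the fix:

import torch

n_vocab = 5
output = torch.randn(1, n_vocab)          # model output for one timestep
one_hot_target = torch.zeros(1, n_vocab)
one_hot_target[0, 3] = 1                  # one-hot encoding of class 3

criterion = torch.nn.CrossEntropyLoss()
# criterion(output, one_hot_target.long())          # 2D target: this is what triggered the error above
_, class_target = torch.max(one_hot_target, dim=1)  # tensor([3]), a 1D tensor of class indices
loss = criterion(output, class_target)              # works
print(loss.item())

In newer PyTorch versions, torch.argmax(one_hot_target, dim=1) does the same conversion in a single call.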