RNN is not learning

Time: 2020-11-10 01:38:47

Tags: python nlp pytorch recurrent-neural-network

No matter how long I train, the accuracy stays low and the loss does not decrease.

input_dim = len(vocab)

embedding_dim: I tried values from 100 to 500, without much improvement

hidden_dim = 512

output_dim = 6

Each "input" to forward() looks like the tensor below; each element is the index of a word in the vocabulary.

tensor([[11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831,
         11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831,
         11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831,
         11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831,
         11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831,
         11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831, 11831,
         11831, 11831, 11831, 11831, 11831, 11831,  5010,  3771,  7949,  2125,
          8461,  1170,  5010,  3771,  7949,  2125,     0,  4937,  4939,  4281]],
       device='cuda:0')
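
For reference, this is a minimal sketch of how an index tensor like the one above could be produced. The `encode` helper, the left-padding scheme, the fallback index 0 for unknown words, and the use of 11831 as a padding index are assumptions for illustration, not taken from the original post.

    import torch

    def encode(tokens, vocab, max_len=80, pad_idx=11831):
        # map tokens to vocabulary indices, falling back to 0 for unknown words (assumption)
        ids = [vocab.get(tok, 0) for tok in tokens][:max_len]
        # left-pad with pad_idx so every sequence has length max_len (assumption)
        ids = [pad_idx] * (max_len - len(ids)) + ids
        # shape (1, max_len), matching the example tensor above
        return torch.tensor([ids], device='cuda:0')

    # example_input = encode("this movie was great".split(), vocab)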

Is there something wrong with the model?


    import torch
    import torch.nn as nn

    class RNN(nn.Module):
        def __init__(self, input_dim, embedding_dim, hidden_dim, output_dim):
            super(RNN, self).__init__()

            self.hidden_dim = hidden_dim

            # embedding lookup, a single-layer vanilla RNN, and a linear output layer
            self.embedding = nn.Embedding(input_dim, embedding_dim)
            self.rnn = nn.RNN(embedding_dim, hidden_dim, batch_first=True)
            self.W = nn.Linear(hidden_dim, output_dim)

            self.softmax = nn.LogSoftmax(dim=1)
            self.loss = nn.NLLLoss()

        def compute_Loss(self, predicted_vector, gold_label):
            return self.loss(predicted_vector, gold_label)

        def hidden0(self, inputs):
            # initial hidden state of zeros: (num_layers, batch, hidden_dim)
            batch_dim = inputs.size(0)
            h0 = torch.zeros(1, batch_dim, self.hidden_dim)
            return h0

        def forward(self, inputs):
            hidden = self.hidden0(inputs).to(get_device())  # get_device() is defined elsewhere

            # flatten to a single sequence of word indices
            inputs = inputs.reshape(-1)
            #print(inputs)

            # (seq_len, embedding_dim) -> (1, seq_len, embedding_dim)
            embed = self.embedding(inputs).unsqueeze(0)

            out, hn = self.rnn(embed, hidden)
            output = out.view(-1, self.hidden_dim)  # (seq_len, hidden_dim)

            z = self.W(output)

            predicted_vector = self.softmax(z)

            # keep only the prediction at the last time step
            predicted_vector = predicted_vector[-1].view(1, 6)

            return predicted_vector

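For context, here is a minimal sketch of how this model might be instantiated and run for a single training step. The optimizer choice, the learning rate, the dummy data, the vocabulary size, and the `get_device()` definition are assumptions for illustration, not part of the original post.

    import torch

    def get_device():
        # the post calls get_device() but does not show it; this is an assumed definition
        return torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    vocab_size = 11832  # placeholder; the post uses input_dim = len(vocab)
    model = RNN(input_dim=vocab_size, embedding_dim=300,
                hidden_dim=512, output_dim=6).to(get_device())
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed optimizer and learning rate

    inputs = torch.randint(0, vocab_size, (1, 80), device=get_device())  # dummy index sequence
    gold_label = torch.tensor([3], device=get_device())                  # dummy class label in [0, 6)

    optimizer.zero_grad()
    predicted_vector = model(inputs)  # (1, 6) log-probabilities for the last time step
    loss = model.compute_Loss(predicted_vector, gold_label)
    loss.backward()
    optimizer.step()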
0 Answers:

No answers yet.