Training a custom LSTM model for sentence classification in PyTorch

Time: 2019-11-13 00:31:41

Tags: deep-learning pytorch lstm

I have a custom LSTM model in PyTorch to which I pass a word2vec embedding matrix and get predictions back. The input is sentences, and the target is a list of ints (0 or 1).

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

# device is used inside forward() below
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

hidden_size = 32
num_layers = 1
num_classes = 2

class customModel(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(customModel, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.bilstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True, bidirectional=True)
        self.fcl = nn.Linear(hidden_size*2, num_classes)

    def forward(self, x):
        # Set initial hidden and cell states 
        h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device)
        c0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(device)

        # Forward propagate LSTM
        out, hidden = self.bilstm(x, (h0, c0)) 
        fw_bilstm = out[-1, :, :self.hidden_size]
        bk_bilstm = out[0, :, :self.hidden_size]
        concat_fw_bw = torch.cat((fw_bilstm, bk_bilstm), dim = 1)
        fc = self.fcl(concat_fw_bw)
        x = F.softmax(F.relu(fc))
        return x
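The training code below calls model(...) without showing how the model is created; a plausible instantiation would look like this (input_size=300 is an assumption for typical word2vec vectors, the question does not state the actual value):

# Hypothetical instantiation (not shown in the question); input_size=300 assumes
# 300-dimensional word2vec vectors.
model = customModel(input_size=300, hidden_size=hidden_size,
                    num_layers=num_layers, num_classes=num_classes).to(device)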

I convert the input into a word2vec embedding matrix and create a torch tensor from it:

embedding_torch = torch.FloatTensor(embedding)

I also convert the target to a torch tensor:

target_seq = torch.FloatTensor(target.astype(np.float32))
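Inspecting the two tensors gives roughly the following (the exact shape of embedding_torch is not printed in the question, so the embedding dimensions here are an assumption; 67349 labels is taken from the error message below):

# For illustration only -- the exact shapes are not printed in the question.
print(embedding_torch.shape)   # assumed to be 3-D, e.g. (num_sentences, seq_len, 300)
print(target_seq.shape)        # torch.Size([67349]), one 0/1 label per sentence
print(target_seq.dtype)        # torch.float32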

My training code looks like this:

# Define hyperparameters
n_epochs = 100
lr=0.01

# Define Loss, Optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)

# Training Run
for epoch in range(1, n_epochs + 1):
    optimizer.zero_grad() 
    output = model(embedding_torch)
    loss = criterion(output, target_seq)
    loss.backward() # Does backpropagation and calculates gradients
    optimizer.step() # Updates the weights accordingly

    if epoch%10 == 0:
        print('Epoch: {}/{}.............'.format(epoch, n_epochs), end=' ')
        print("Loss: {:.4f}".format(loss.item()))

Running this training loop throws the following error:

ValueError: Expected input batch_size (1) to match target batch_size (67349)

This happens because output = model(embedding_torch) produces the tensor tensor([[0.5000, 0.5000]], grad_fn=<SoftmaxBackward>), i.e. a single row, while target_seq is tensor([0., 0., 1., ..., 1., 1., 0.]) with 67349 entries. I understand this is a size mismatch, but I am not sure how to fix it.
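A minimal way to see the mismatch directly (assuming the model and tensors above are already defined) is to print both shapes right before the loss is computed:

# Diagnostic sketch: print both shapes before computing the loss.
with torch.no_grad():
    output = model(embedding_torch)
print(output.shape)       # torch.Size([1, 2])   -> batch size 1
print(target_seq.shape)   # torch.Size([67349])  -> batch size 67349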

0 Answers:

There are no answers yet.