RuntimeError when running a PyTorch model on my local machine

Asked: 2020-06-16 04:01:46

Tags: python deep-learning pytorch

I am running this notebook locally:

https://github.com/udacity/deep-learning-v2-pytorch/blob/master/sentiment-rnn/Sentiment_RNN_Solution.ipynb

Everything worked fine until I started training the model:

# training params

epochs = 4 # 3-4 is approx where I noticed the validation loss stop decreasing

counter = 0
print_every = 100
clip=5 # gradient clipping

# move model to GPU, if available
if(train_on_gpu):
    net.cuda()

net.train()
# train for some number of epochs
for e in range(epochs):
    # initialize hidden state
    h = net.init_hidden(batch_size)

    # batch loop
    for inputs, labels in train_loader:
        counter += 1

        if(train_on_gpu):
            inputs, labels = inputs.cuda(), labels.cuda()

        # Creating new variables for the hidden state, otherwise
        # we'd backprop through the entire training history
        h = tuple([each.data for each in h])

        # zero accumulated gradients
        net.zero_grad()

        # get the output from the model
        output, h = net(inputs, h)

        # calculate the loss and perform backprop
        loss = criterion(output.squeeze(), labels.float())
        loss.backward()
        # `clip_grad_norm_` helps prevent the exploding gradient problem in RNNs / LSTMs.
        nn.utils.clip_grad_norm_(net.parameters(), clip)
        optimizer.step()

        # loss stats
        if counter % print_every == 0:
            # Get validation loss
            val_h = net.init_hidden(batch_size)
            val_losses = []
            net.eval()
            for inputs, labels in valid_loader:

                # Creating new variables for the hidden state, otherwise
                # we'd backprop through the entire training history
                val_h = tuple([each.data for each in val_h])

                if(train_on_gpu):
                    inputs, labels = inputs.cuda(), labels.cuda()

                output, val_h = net(inputs, val_h)
                val_loss = criterion(output.squeeze(), labels.float())

                val_losses.append(val_loss.item())

            net.train()
            print("Epoch: {}/{}...".format(e+1, epochs),
                  "Step: {}...".format(counter),
                  "Loss: {:.6f}...".format(loss.item()),
                  "Val Loss: {:.6f}".format(np.mean(val_losses)))

The error that occurs:

RuntimeError                              Traceback (most recent call last)
<ipython-input-31-9f7dea11cb7b> in <module>
     32 
     33         # get the output from the model
---> 34         output, h = net(inputs, h)
     35 
     36         # calculate the loss and perform backprop

c:\users\asus\.conda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

<ipython-input-16-b99cefc1dc61> in forward(self, x, hidden)
     36 
     37         # embeddings and lstm_out
---> 38         embeds = self.embedding(x)
     39         lstm_out, hidden = self.lstm(embeds, hidden)
     40 

c:\users\asus\.conda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

c:\users\asus\.conda\envs\pytorch\lib\site-packages\torch\nn\modules\sparse.py in forward(self, input)
    110 
    111     def forward(self, input):
--> 112         return F.embedding(
    113             input, self.weight, self.padding_idx, self.max_norm,
    114             self.norm_type, self.scale_grad_by_freq, self.sparse)

c:\users\asus\.conda\envs\pytorch\lib\site-packages\torch\nn\functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
   1722         # remove once script supports set_grad_enabled
   1723         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 1724     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
   1725 
   1726 

RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long; but got torch.cuda.IntTensor instead (while checking arguments for embedding)

I don't understand why this happens. I tried to find a solution online; it said I need to move the model and the data to the GPU. I did that, but the problem persists.

1 Answer:

Answer 0 (score: 0)

You are trying to embed inputs, which is given as an integer tensor (torch.int). Embedding indices must be of type torch.long, because they index into the embedding table and cannot be floating-point values.

inputs needs to be converted to torch.long:

inputs = inputs.to(torch.long)
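
For reference, here is a minimal standalone sketch (the names emb and idx_int are illustrative) showing the dtype requirement of nn.Embedding on the PyTorch version in your traceback:

import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=10, embedding_dim=4)

idx_int = torch.tensor([1, 2, 3], dtype=torch.int)
# emb(idx_int)  # raises the same RuntimeError: 'indices' must be Long

out = emb(idx_int.to(torch.long))  # works once the indices are torch.long
print(out.shape)  # torch.Size([3, 4])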

It seems you removed the conversion to long, since in the notebook it is done inside the model:

# embeddings and lstm_out
x = x.long()
embeds = self.embedding(x)

In your stack trace the line x = x.long() (which is the same as using .to(torch.long)) is missing.
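
For comparison, a sketch of the start of forward with the cast restored (the layer names self.embedding and self.lstm are taken from your stack trace; the rest of the method is elided):

def forward(self, x, hidden):
    # cast the indices first; nn.Embedding requires torch.long
    x = x.long()

    # embeddings and lstm_out
    embeds = self.embedding(x)
    lstm_out, hidden = self.lstm(embeds, hidden)
    # ... rest of the forward pass unchanged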