Question

Questions and Help

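I am trying to fine-tune a transformer model for question answering, and I am using the BCEWithLogitsLoss function. However, when I try to compute the loss, I get this error:

RuntimeError: _th_exp_out not supported on CUDAType for Long

My input is a tensor of shape [16, 2] with dtype Long.
Sorry, I don't know how to solve this. I tried other dtypes (int32, float32, double), but that did not work.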

The code is as follows:

def loss_fn(self, preds, labels):
    return torch.nn.BCEWithLogitsLoss()(preds, labels)

def train_fn(self, dataloader, model, optimizer, device):
    # Some other stuff here
    pred = model(
            token_ids = token_ids,
            attention_mask = attention_mask,
            token_type_ids = token_type_ids)
    start_scores = torch.argmax(pred[0], dim=1)
    end_scores = torch.argmax(pred[1], dim=1)
        
    pred = torch.tensor(list(zip(start_scores, end_scores)))
    pred = pred.to(device, dtype=torch.long)

    batch_loss = self.loss_fn(pred, label)

Answer 1

In my case, passing both pred and label as float worked.
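
To illustrate, here is a minimal sketch of that fix. The tensors below are hypothetical stand-ins for the `pred` and `label` in the question (random values, shape [16, 2], dtype Long), and the example runs on the CPU so it does not need a GPU. It only shows the dtype cast, not a full training step.

```python
import torch

loss_fn = torch.nn.BCEWithLogitsLoss()

# Hypothetical stand-ins for the question's tensors: shape [16, 2], dtype Long.
pred = torch.randint(0, 2, (16, 2), dtype=torch.long)
label = torch.randint(0, 2, (16, 2), dtype=torch.long)

# BCEWithLogitsLoss applies a sigmoid internally, which needs floating-point
# input, so cast BOTH arguments to float before computing the loss.
loss = loss_fn(pred.float(), label.float())
print(loss.dtype, loss.shape)
```

In the question's code this would probably correspond to `pred.to(device, dtype=torch.float)` together with `label.float()`. Note that casting only removes the dtype error: `torch.argmax` returns integer indices and is not differentiable, so a loss computed this way will most likely not propagate gradients back to the model.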
