I have spent the whole day trying to solve this problem. The call
torch.autograd.backward(loss_seq, grad_seq)
raises an error.
Output:
Traceback (most recent call last):
File "train_vgg.py", line 272, in <module>
torch.autograd.backward(loss_seq, grad_seq)
File "/root/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: invalid gradient at index 0 - expected shape [] but got [1]
Input:
loss_seq:[tensor(7.3761, device='cuda:1', grad_fn=<ThAddBackward>), tensor(4.3005, device='cuda:1', grad_fn=<ThAddBackward>), tensor(4.2209, device='cuda:1', grad_fn=<ThAddBackward>)]
grad_seq:[tensor([1.], device='cuda:1'), tensor([1.], device='cuda:1'), tensor([1.], device='cuda:1')]
Can anyone tell me how to fix this?
Code:
images = Variable(images).cuda(gpu)
label_yaw = Variable(labels[:,0]).cuda(gpu)
label_pitch = Variable(labels[:,1]).cuda(gpu)
label_roll = Variable(labels[:,2]).cuda(gpu)
pre_yaw, pre_pitch, pre_roll = model(images)
# Cross entropy loss
loss_yaw = criterion(pre_yaw, label_yaw)
loss_pitch = criterion(pre_pitch, label_pitch)
loss_roll = criterion(pre_roll, label_roll)
loss_yaw += 0.005 * loss_reg_yaw
loss_pitch += 0.005 * loss_reg_pitch
loss_roll += 0.005 * loss_reg_roll
loss_seq = [loss_yaw, loss_pitch, loss_roll]
grad_seq = [torch.ones(1).cuda(gpu) for _ in range(len(loss_seq))]
# crash here
torch.autograd.backward(loss_seq, grad_seq)
Answer 0 (score: 1)
I solved the problem. The only change was from:
grad_seq = [torch.ones(1).cuda(gpu) for _ in range(len(loss_seq))]
to:
grad_seq = [torch.tensor(1.0).cuda(gpu) for _ in range(len(loss_seq))]
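The reason this works appears to be a shape mismatch: each loss is a 0-dimensional scalar (shape []), while torch.ones(1) is a 1-element vector (shape [1]), and each gradient passed to torch.autograd.backward must have the same shape as its loss. A minimal sketch on the CPU, where w and the three toy losses are hypothetical stand-ins for the real model outputs:

```python
import torch

# Hypothetical stand-ins for the three scalar losses (not the original model).
w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
loss_seq = [(w * k).sum() for k in (1.0, 2.0, 3.0)]

# Compare the shapes involved in the error message.
print(loss_seq[0].shape)        # torch.Size([])  -> what backward expects
print(torch.ones(1).shape)      # torch.Size([1]) -> what was passed
print(torch.tensor(1.0).shape)  # torch.Size([])  -> the fix

# ones_like copies shape, dtype, and device from each loss,
# so the gradients match by construction.
grad_seq = [torch.ones_like(loss) for loss in loss_seq]
torch.autograd.backward(loss_seq, grad_seq)

print(w.grad)  # tensor([6., 6., 6.])
```

Using torch.ones_like(loss) also avoids the separate .cuda(gpu) call, since the result lives on the same device as the loss. If all three losses are weighted equally, sum(loss_seq).backward() should give the same gradients without building grad_seq at all.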