The following loop leaks memory on every iteration: the tensors produced each pass are never freed. The leak appears to be caused by the call to grad_loss.backward() in the code below. Am I missing something, or is this a bug in PyTorch?
for (images, one_hot_labels) in tqdm(batched_train_data):
    images = images.to(device)
    one_hot_labels = one_hot_labels.to(device)
    # I collect the batch size here because the last batch may be smaller.
    batch_size = images.shape[0]
    images.requires_grad = True
    optimizer.zero_grad()
    # As images is not a parameter, optimizer.zero_grad() won't reset its gradient.
    if images.grad is not None:
        images.grad.data.zero_()
    probabilities = model.forward(images)
    loss = loss_func(probabilities, one_hot_labels)
    # I want to use .backward() twice rather than autograd.grad because
    # I want to accumulate the gradients.
    loss.backward(create_graph=True)
    grad_loss = grad_loss_func(images.grad)
    grad_loss.backward()
    optimizer.step()

    labels = one_hot_labels.detach().argmax(dim=1)
    predictions = probabilities.detach().argmax(dim=1)
    num_correct = int(predictions.eq(labels).sum())
    train_data_length += batch_size
    train_correct += num_correct
    train_loss += float(loss.detach()) * batch_size

    writer.add_graph(model, images)
    writer.close()

    # To stop memory leaks
    del images
    del one_hot_labels
    del probabilities
    del loss
    del grad_loss
    del labels
    del predictions
    del num_correct
Answer 0 (score: 0)
To fix this, you need to replace

images.grad.data.zero_()

with

images.grad = None

I believe this is because images.grad.data.zero_() does not free the computation graph attached to images, so the graph keeps growing as the loop iterates.
I would also recommend avoiding operations on .data, as they are unsafe.
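A minimal sketch of the fixed pattern, with a toy tensor `x` standing in for `images` and a squared-norm penalty standing in for `grad_loss_func` (both are assumptions for illustration):

```python
import torch

x = torch.randn(4, 3, requires_grad=True)  # stands in for images
w = torch.randn(3, 1, requires_grad=True)  # stands in for model parameters

for _ in range(3):
    # Dropping the gradient entirely (instead of zeroing it in place)
    # also releases any graph still reachable through x.grad.
    x.grad = None
    loss = (x @ w).pow(2).sum()
    # create_graph=True records the backward pass so that x.grad
    # itself can be differentiated.
    loss.backward(create_graph=True)
    grad_loss = x.grad.pow(2).sum()  # stand-in for grad_loss_func
    grad_loss.backward()
```

Because `x.grad` is reset to `None` at the top of each iteration, no double-backward graph from the previous pass survives into the next one.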
Answer 1 (score: 0)
If there is a section of your code where you do not want to build a graph for backpropagation, use:

with torch.no_grad():
    # your code goes here
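For instance, the evaluation statistics in the training loop above (labels, predictions, num_correct) do not need gradients; a minimal sketch, using a hypothetical linear model in place of the poster's `model`:

```python
import torch

model = torch.nn.Linear(3, 2)   # stand-in for the poster's model
images = torch.randn(4, 3)

with torch.no_grad():
    # No graph is recorded inside this block, so these tensors
    # hold no references that could keep memory alive.
    probabilities = model(images)
    predictions = probabilities.argmax(dim=1)

assert probabilities.requires_grad is False
```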