The following loop leaks memory on every iteration: the tensors produced each pass are never freed. The leak appears to be caused by the call to grad_loss.backward() in the code below. Am I missing something, or is this a bug in PyTorch?
for (images, one_hot_labels) in tqdm(batched_train_data):
    images = images.to(device)
    one_hot_labels = one_hot_labels.to(device)
    # I collect the batch size here because the last batch may be smaller.
    batch_size = images.shape[0]
    images.requires_grad = True
    optimizer.zero_grad()
    # As images is not a parameter, optimizer.zero_grad() won't reset its gradient.
    if images.grad is not None:
        images.grad.data.zero_()
    probabilities = model.forward(images)
    loss = loss_func(probabilities, one_hot_labels)
    # I want to use .backward() twice rather than autograd.grad because
    # I want to accumulate the gradients.
    loss.backward(create_graph=True)
    grad_loss = grad_loss_func(images.grad)
    grad_loss.backward()
    optimizer.step()

    labels = one_hot_labels.detach().argmax(dim=1)
    predictions = probabilities.detach().argmax(dim=1)
    num_correct = int(predictions.eq(labels).sum())
    train_data_length += batch_size
    train_correct += num_correct
    train_loss += float(loss.detach()) * batch_size

    writer.add_graph(model, images)
    writer.close()

    # To stop memory leaks
    del images
    del one_hot_labels
    del probabilities
    del loss
    del grad_loss
    del labels
    del predictions
    del num_correct
Answer 0 (score: 0)
To fix this, you need to replace

images.grad.data.zero_()

with

images.grad = None

I believe this is because images.grad.data.zero_() does not free the computation graph attached to images, so the graph keeps growing as the loop iterates.
I would also recommend avoiding operations on .data, as they are unsafe.
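A minimal sketch of the fixed pattern, with a toy tensor `x` standing in for `images` and a squared-norm penalty standing in for `grad_loss_func` (both are assumptions for illustration):

```python
import torch

x = torch.randn(4, 3, requires_grad=True)  # stands in for images
w = torch.randn(3, 1, requires_grad=True)  # stands in for model parameters

for _ in range(3):
    # Dropping the gradient entirely (instead of zeroing it in place)
    # also releases any graph still reachable through x.grad.
    x.grad = None
    loss = (x @ w).pow(2).sum()
    # create_graph=True records the backward pass so that x.grad
    # itself can be differentiated.
    loss.backward(create_graph=True)
    grad_loss = x.grad.pow(2).sum()  # stand-in for grad_loss_func
    grad_loss.backward()
```

Because `x.grad` is reset to `None` at the top of each iteration, no double-backward graph from the previous pass survives into the next one.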
Answer 1 (score: 0)
If there is a section of your code where you do not want to build a graph for backpropagation, use:

with torch.no_grad():
    # your code goes here
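For instance, the evaluation statistics in the training loop above (labels, predictions, num_correct) do not need gradients; a minimal sketch, using a hypothetical linear model in place of the poster's `model`:

```python
import torch

model = torch.nn.Linear(3, 2)   # stand-in for the poster's model
images = torch.randn(4, 3)

with torch.no_grad():
    # No graph is recorded inside this block, so these tensors
    # hold no references that could keep memory alive.
    probabilities = model(images)
    predictions = probabilities.argmax(dim=1)

assert probabilities.requires_grad is False
```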