When I only train without validating, the model trains fine, but during evaluation it runs out of memory. I don't understand why this should be a problem, especially since I'm using torch.no_grad()?
def test(epoch, net, testloader, optimizer):
    net.eval()
    test_loss = 0
    correct = 0
    total = 0
    idx = 0
    features_all = []
    for batch_idx, (inputs, targets) in enumerate(testloader):
        with torch.no_grad():
            idx = batch_idx
            # inputs, targets = inputs.cpu(), targets.cpu()
            if use_cuda:
                inputs, targets = inputs.cuda(), targets.cuda()
            inputs, targets = Variable(inputs), Variable(targets)
            save_features, out, ce_loss = net(inputs, targets)

            test_loss += ce_loss.item()
            _, predicted = torch.max(out.data, 1)
            total += targets.size(0)
            correct += predicted.eq(targets.data).cpu().sum().item()
            features_all.append((save_features, predicted, targets.data))

    test_acc = 100. * correct / total
    test_loss = test_loss / (idx + 1)
    logging.info('test, test_acc = %.4f, test_loss = %.4f' % (test_acc, test_loss))
    print('test, test_acc = %.4f, test_loss = %.4f' % (test_acc, test_loss))
    return features_all, test_acc
Answer 0 (score: 1)
features_all.append((save_features, predicted, targets.data))
This line keeps references to tensors that live in GPU memory, so that memory cannot be released when the loop moves on to the next iteration (and this eventually exhausts GPU memory). torch.no_grad() only prevents autograd from recording the graph; it does not free tensors you keep referencing. Move the tensors to the CPU (with .cpu()) before saving them.
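A minimal sketch of the fixed loop body, assuming save_features is a single tensor returned by net (if it is a tuple or list of tensors, move each element individually):

        with torch.no_grad():
            idx = batch_idx
            if use_cuda:
                inputs, targets = inputs.cuda(), targets.cuda()
            save_features, out, ce_loss = net(inputs, targets)

            test_loss += ce_loss.item()          # .item() already returns a Python float
            _, predicted = torch.max(out, 1)
            total += targets.size(0)
            correct += predicted.eq(targets).cpu().sum().item()

            # detach() drops any autograd references and .cpu() copies the data
            # to host memory, so the GPU copies can be freed on the next iteration.
            features_all.append((
                save_features.detach().cpu(),
                predicted.cpu(),
                targets.cpu(),
            ))

With this change, features_all only holds CPU copies, so GPU memory usage stays roughly constant across batches instead of growing with every iteration.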