How to recover from a CUDA out-of-memory error in PyTorch?

Asked: 2018-12-18 16:13:58

Tags: pytorch

I tried the code below. When the code in the try block fails with a CUDA out-of-memory error, I halve the batch size in the except block, but running the model in the except block still hits the same error. I'm sure the half-sized batch fits, because when I run the except-block code directly, without attempting the full batch first, it works fine. By the way, is there a way to automatically choose a batch size that fully uses CUDA memory without overflowing it?

try:
    output = model(
        Variable(torch.LongTensor(np.array(x))).to(device),
        Variable(torch.LongTensor(np.array(pos))).to(device),
        Variable(torch.LongTensor(np.array(m))).to(device),
    )
    loss = criterion(output, Variable(torch.LongTensor(y)).to(device))  # lb.transform(y)
    loss.backward()
    optimizer.step()
    losses.append(loss.data.mean())
except:
    # Out of memory: split the batch in half and run each half separately
    half = int(len(x) / 2)
    x1, x2 = x[:half], x[half:]
    pos1, pos2 = pos[:half], pos[half:]
    m1, m2 = m[:half], m[half:]
    y1, y2 = y[:half], y[half:]
    optimizer.zero_grad()
    # first half
    output = model(
        Variable(torch.LongTensor(np.array(x1))).to(device),
        Variable(torch.LongTensor(np.array(pos1))).to(device),
        Variable(torch.LongTensor(np.array(m1))).to(device),
    )
    loss = criterion(output, Variable(torch.LongTensor(y1)).to(device))  # lb.transform(y)
    loss.backward()
    optimizer.step()
    losses.append(loss.data.mean())
    # second half
    output = model(
        Variable(torch.LongTensor(np.array(x2))).to(device),
        Variable(torch.LongTensor(np.array(pos2))).to(device),
        Variable(torch.LongTensor(np.array(m2))).to(device),
    )
    loss = criterion(output, Variable(torch.LongTensor(y2)).to(device))  # lb.transform(y)
    loss.backward()
    optimizer.step()
    losses.append(loss.data.mean())

1 Answer:

Answer 0 (score: 0)

It looks like something is still being held on your GPU. Have you tried calling torch.cuda.empty_cache() at the start of the except block to free the CUDA cache?
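
Below is a minimal sketch of how the except block could be restructured along those lines. It assumes the same model, criterion, optimizer, device, losses, and batch variables (x, pos, m, y) as in the question; the to_device_batch helper and the check for "out of memory" in the error message are illustrative conventions I'm adding here, not part of the original code.

import numpy as np
import torch

def to_device_batch(x, pos, m):
    # Hypothetical helper: build the three model inputs as GPU tensors.
    return (torch.LongTensor(np.array(x)).to(device),
            torch.LongTensor(np.array(pos)).to(device),
            torch.LongTensor(np.array(m)).to(device))

try:
    optimizer.zero_grad()
    output = model(*to_device_batch(x, pos, m))
    loss = criterion(output, torch.LongTensor(y).to(device))
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
except RuntimeError as e:
    # Only handle CUDA out-of-memory errors; re-raise everything else.
    if "out of memory" not in str(e):
        raise
    # Drop references left over from the failed attempt so their memory can
    # actually be reclaimed, then release PyTorch's cached blocks.
    output = loss = None
    torch.cuda.empty_cache()
    half = len(x) // 2
    for part in (slice(None, half), slice(half, None)):
        optimizer.zero_grad()
        output = model(*to_device_batch(x[part], pos[part], m[part]))
        loss = criterion(output, torch.LongTensor(y[part]).to(device))
        loss.backward()
        optimizer.step()
        losses.append(loss.item())

If the same OOM still occurs with the half-sized batches, the likely reason is that tensors from the failed full-batch attempt are still referenced when the except block runs; torch.cuda.empty_cache() can only return cached blocks that no live tensor is using, so dropping those references first is what actually frees the memory. Note also that since PyTorch 0.4 wrapping tensors in Variable is no longer necessary.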