I am running an evaluation script in PyTorch. I have a number of trained models (*.pt files), which are loaded and moved to the GPU, taking a total of 270MB of GPU memory. I use a batch size of 1. For every sample, I load a single image and move it to the GPU. Depending on the sample, I then need to run a sequence of these trained models. Some models take a tensor as input and a tensor as output. Other models take a tensor as input but a string as output. The final model in a sequence always has a string as output. The intermediary tensors are temporarily stored in a dictionary. When a model has consumed its tensor input, it is removed with `del`. Still, I notice that after every sample the GPU memory keeps increasing until the entire memory is full.
Below is some pseudocode to give you a better idea of what is going on:
with torch.no_grad():
    trained_models = load_models_from_pt()  # Loaded and moved to GPU, taking 270MB
    model = Model(trained_models)  # Keeps the trained_models in a dictionary by name
    for sample in data_loader:
        # A sample contains a single image and is moved to the GPU
        # A sample also has some other information, but no other tensors
        model.forward(sample)
class Model(nn.Module):
    def __init__(self, trained_models):
        super().__init__()
        self.trained_models = trained_models
        self.intermediary = {}

    def forward(self, sample):
        for i, elem in enumerate(sample['sequence']):
            name = elem['name']
            inp = elem['input']  # 'in' is a reserved keyword in Python
            if name == 'a':
                model = self.trained_models['a']
                out = model(self.intermediary[inp])
                del self.intermediary[inp]
                self.intermediary[i] = out
            elif name == 'b':
                model = self.trained_models['b']
                out = model(self.intermediary[inp])
                del self.intermediary[inp]
                self.intermediary[i] = out
            elif ...
I have no idea why the GPU is running out of memory. Any ideas?
Answer 0 (score: 1)
Try adding `torch.cuda.empty_cache()` after the `del`.
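A minimal sketch of what that would look like in the question's `forward` loop. The helper `run_step` and its parameter names are hypothetical, chosen only to mirror the `intermediary` dictionary from the question; the answer itself only suggests adding the `empty_cache()` call. Note that `del` drops the Python reference, while `torch.cuda.empty_cache()` asks PyTorch's caching allocator to hand freed blocks back to the GPU driver (it is a no-op when CUDA is not in use):

```python
import torch

def run_step(model, intermediary, key, out_key):
    """Run one model in the sequence, then free its input tensor.

    Hypothetical helper illustrating the suggested fix: delete the
    consumed intermediate tensor, then release cached GPU blocks.
    """
    out = model(intermediary[key])
    del intermediary[key]          # drop the Python reference to the input tensor
    torch.cuda.empty_cache()       # return freed blocks to the GPU (no-op on CPU)
    intermediary[out_key] = out
    return out
```

Whether this actually stops the growth depends on the rest of the loop: if some reference (e.g. an entry left in `self.intermediary`, or a stored output) still points at a tensor, `empty_cache()` cannot reclaim it.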