Question

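Is there a way to force a maximum on the amount of GPU memory available to a particular PyTorch instance? For example, my GPU may have 12 GB available, but I would like to assign at most 4 GB to a particular process.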

Answer 1

Update (4 March 2021): this is now available in the stable 1.8.0 version of PyTorch, and it is also covered in the docs.

Original answer follows.

This feature request has been merged into the PyTorch master branch. However, it has not yet been introduced in a stable release.

Introduced as set_per_process_memory_fraction

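"Set memory fraction for a process. The fraction is used to limit the caching allocator's allocated memory on a CUDA device. The allowed value equals the total visible memory multiplied by the fraction. If a process tries to allocate more than the allowed value, the allocator raises an out-of-memory error."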

You can check the tests for usage examples.
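
To connect this to the numbers in the question, the fraction for a fixed byte budget can be derived from the device's total memory. The following is a minimal sketch and is not taken from the PyTorch tests: the helper names fraction_for_limit and limit_gpu_memory are hypothetical, and the second function assumes PyTorch 1.8.0 or later with a CUDA device present.

```python
def fraction_for_limit(limit_bytes, total_bytes):
    """Return the fraction of total_bytes that corresponds to limit_bytes.

    Hypothetical helper; the result is clamped to the 0~1 range that
    set_per_process_memory_fraction accepts.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return max(0.0, min(1.0, limit_bytes / total_bytes))


def limit_gpu_memory(limit_bytes, device=0):
    """Cap this process's caching allocator at roughly limit_bytes on device.

    Hypothetical wrapper; torch is imported here so that the helper above
    can be used on a machine without PyTorch installed.
    """
    import torch

    total = torch.cuda.get_device_properties(device).total_memory
    fraction = fraction_for_limit(limit_bytes, total)
    torch.cuda.set_per_process_memory_fraction(fraction, device)
    return fraction


# The question's numbers: 4 GiB out of 12 GiB is a fraction of one third.
GIB = 1024 ** 3
example_fraction = fraction_for_limit(4 * GIB, 12 * GIB)
```

Calling limit_gpu_memory(4 * GIB) early in the process, before any large allocation, would then apply the cap. As the quoted description says, the limit applies to the caching allocator.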

Answer 2

In contrast to TensorFlow, which by default reserves nearly all of the GPU's memory, PyTorch only uses as much as it needs. However, you could:

Reduce the batch size.
Use CUDA_VISIBLE_DEVICES=<GPU index> (can be several, comma-separated) to limit which GPUs can be accessed.

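To set this from within the program, try:

import os
# Set this before CUDA is initialized (safest: before importing torch).
os.environ["CUDA_VISIBLE_DEVICES"]="0"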
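
For several GPUs, the variable takes a comma-separated list of indices. Here is a small sketch with a hypothetical helper, restrict_visible_gpus. It assumes the call happens before CUDA is initialized, since the variable is read at that point. Note that this restricts which devices are visible; it does not cap memory use on any one device.

```python
import os


def restrict_visible_gpus(indices):
    """Set CUDA_VISIBLE_DEVICES from a list of GPU indices (hypothetical helper)."""
    value = ",".join(str(i) for i in indices)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Expose only GPUs 0 and 1 to this process.
visible = restrict_visible_gpus([0, 1])
```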

Answer 3

Update PyTorch to 1.8.0 (pip install --upgrade torch==1.8.0).

Function: torch.cuda.set_per_process_memory_fraction(fraction, device=None)

Parameters:

fraction (float) – Range: 0~1. Allowed memory equals total_memory * fraction.

device (torch.device or int, optional) – The selected device. If None, the default CUDA device is used.

For example:

import torch
torch.cuda.set_per_process_memory_fraction(0.5, 0)
torch.cuda.empty_cache()
total_memory = torch.cuda.get_device_properties(0).total_memory
# less than 0.5 will be ok:
tmp_tensor = torch.empty(int(total_memory * 0.499), dtype=torch.int8, device='cuda')
del tmp_tensor
torch.cuda.empty_cache()
# this allocation will raise an OOM error:
torch.empty(total_memory // 2, dtype=torch.int8, device='cuda')

"""
It raises an error as follows: 
RuntimeError: CUDA out of memory. Tried to allocate 5.59 GiB (GPU 0; 11.17 GiB total capacity; 0 bytes already allocated; 10.91 GiB free; 5.59 GiB allowed; 0 bytes reserved in total by PyTorch)
"""

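The "5.59 GiB allowed" figure in the message is consistent with the rule stated in the parameter description (total_memory * fraction). Below is a rough check of that arithmetic. It uses the rounded 11.17 GiB capacity printed in the message rather than the exact byte count, so the result is only approximate, and allowed_gib is a hypothetical helper.

```python
def allowed_gib(total_gib, fraction):
    """Allowed memory in GiB for a given fraction (hypothetical helper)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in the range 0~1")
    return total_gib * fraction


# Capacity reported in the error message, with the fraction from the example.
allowed = allowed_gib(11.17, 0.5)
print(f"{allowed:.3f} GiB allowed")
```
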
Force GPU memory limit in PyTorch
