Question

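I work in an environment where computational resources are shared: we have a few server machines, each equipped with several Nvidia Titan X GPUs.

For small to medium-sized models, the 12 GB of a Titan X is usually enough for 2-3 people to train concurrently on the same GPU. If the models are small enough that a single model does not use all of the Titan X's compute units, this can actually result in a speedup compared with running one training process after another. Even where concurrent access to the GPU does slow down an individual training run, it is still useful to have the flexibility of several users running on the GPU at once.

The problem with TensorFlow is that, by default, it allocates all of the available GPU memory when it starts. Even for a small 2-layer neural network, I see that the whole 12 GB of the Titan X is used up.

Is there a way to make TensorFlow allocate only, say, 4 GB of GPU memory, if one knows that this amount is enough for a given model?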

Answer 1

You can set the fraction of GPU memory to be allocated when you construct a tf.Session by passing a tf.GPUOptions as part of the optional config argument:

# Assume that you have 12GB of GPU memory and want to allocate ~4GB:
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)

sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

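per_process_gpu_memory_fraction acts as a hard upper bound on the amount of GPU memory that the process will use on each GPU on the same machine. Currently, this fraction is applied uniformly to all GPUs on the same machine; there is no way to set it on a per-GPU basis.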
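Since the option takes a fraction rather than an absolute size, it can help to compute the value from the amount you actually want. The following is only a minimal sketch in plain Python; the helper name memory_fraction is made up for illustration and is not part of TensorFlow.

```python
def memory_fraction(desired_gb, total_gb):
    """Return desired_gb as a fraction of total_gb, capped at 1.0."""
    if desired_gb <= 0 or total_gb <= 0:
        raise ValueError("memory sizes must be positive")
    # Never ask for more than the whole card.
    return min(desired_gb / total_gb, 1.0)


# ~4 GB out of a 12 GB Titan X, as in the question.
fraction = memory_fraction(4, 12)
print(round(fraction, 3))  # 0.333
```

The result can then be passed as per_process_gpu_memory_fraction. Because the fraction applies to every GPU, machines with cards of different sizes will end up with different absolute limits per card.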

Answer 2

config = tf.ConfigProto()
config.gpu_options.allow_growth=True
sess = tf.Session(config=config)

https://github.com/tensorflow/tensorflow/issues/1578

Answer 3

Here is an excerpt from the book Deep Learning with TensorFlow:

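In some cases it is desirable for the process to allocate only a subset of the available memory, or to grow its memory usage only as the process needs it. TensorFlow provides two configuration options on the session to control this. The first is the allow_growth option, which attempts to allocate only as much GPU memory as runtime allocations require: it starts out allocating very little memory, and as sessions run and more GPU memory is needed, the GPU memory region used by the TensorFlow process is extended.

1) Allow growth (more flexible):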

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config, ...)

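The second method is the per_process_gpu_memory_fraction option, which determines the fraction of the total memory of each visible GPU that should be allocated. Note: TensorFlow does not release memory once it has been allocated, because releasing it can make memory fragmentation even worse.

2) Allocate fixed memory:

To allocate only 40% of the total memory of each GPU: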

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
session = tf.Session(config=config, ...)

Note: this is only useful if you really want to bound the amount of GPU memory available to the TensorFlow process.

Answer 4

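All of the answers above assume execution with a sess.run() call, which is becoming the exception rather than the rule in recent versions of TensorFlow.

When using the tf.Estimator framework (TensorFlow 1.4 and above), the way to pass the fraction along to the implicitly created MonitoredTrainingSession is: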

opts = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
conf = tf.ConfigProto(gpu_options=opts)
trainingConfig = tf.estimator.RunConfig(session_config=conf, ...)
tf.estimator.Estimator(model_fn=..., 
                       config=trainingConfig)

Similarly, in Eager mode (TensorFlow 1.5 and above):

opts = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
conf = tf.ConfigProto(gpu_options=opts)
tfe.enable_eager_execution(config=conf)

Edit (11-04-2018): As an example, if you are using tf.contrib.gan.train, you can use something similar to the following:

tf.contrib.gan.gan_train(........, config=conf)

Answer 5

Updated for TensorFlow 2.0 Alpha and beyond

From the 2.0 Alpha docs, the answer is now just one line before you do anything with TensorFlow:

import tensorflow as tf
tf.config.gpu.set_per_process_memory_growth(True)

Answer 6

If you are using TensorFlow 2, try the following:

config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.Session(config=config)

Answer 7

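Shameless plug: if you install the GPU-enabled TensorFlow, the session will first allocate memory on all GPUs, whether you set it to use only the CPU or the GPU. My tip is that even if you set the graph to use the CPU only, you should still set the same configuration (as in the answers above :)) to prevent unwanted GPU occupation.

In an interactive interface such as IPython you should also set that configuration; otherwise it will allocate all the memory and leave almost none for others. This is sometimes hard to notice.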
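For the CPU-only case described above, one further option is to hide the GPUs from the process altogether with the CUDA_VISIBLE_DEVICES environment variable. This is not part of the answer and is a CUDA runtime setting rather than a TensorFlow API, so treat the following as a sketch under the assumption that it runs before TensorFlow is first imported in the process or notebook kernel.

```python
import os

# Assumption: this executes before the first `import tensorflow`.
# An empty value hides every CUDA device from the process, so TensorFlow
# finds no GPU on which to allocate memory.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

print(repr(os.environ["CUDA_VISIBLE_DEVICES"]))  # ''
```

In IPython or Jupyter this belongs in the very first cell; setting it after TensorFlow has already initialized the GPUs has no effect on the running process.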

Answer 8

For TensorFlow 2.0, this solution worked for me. (TF-GPU 2.0, Windows 10, GeForce RTX 2070)

physical_devices = tf.config.experimental.list_physical_devices('GPU')
assert len(physical_devices) > 0, "Not enough GPU hardware devices available"
tf.config.experimental.set_memory_growth(physical_devices[0], True)
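The snippet above configures only the first GPU and fails its assertion on a machine without one. A possible variant, sketched below, loops over every visible GPU and simply does nothing when there are none. The helper name enable_memory_growth_on_all_gpus is hypothetical, and the TensorFlow module is passed in as an argument so the function itself does not import it; the fallback between tf.config.list_physical_devices and tf.config.experimental.list_physical_devices assumes your TensorFlow 2.x release has at least one of the two.

```python
def enable_memory_growth_on_all_gpus(tf):
    """Turn on memory growth for every visible GPU.

    `tf` is the imported tensorflow module. Returns the number of GPUs
    that were configured (0 on a CPU-only machine).
    """
    config = tf.config
    # Later TF 2.x releases expose list_physical_devices on tf.config;
    # earlier ones only have it under tf.config.experimental.
    list_devices = getattr(config, "list_physical_devices", None)
    if list_devices is None:
        list_devices = config.experimental.list_physical_devices
    gpus = list_devices("GPU")
    for gpu in gpus:
        # Has to happen before the GPU is initialized, otherwise
        # TensorFlow raises a RuntimeError.
        config.experimental.set_memory_growth(gpu, True)
    return len(gpus)
```

Typical use would be to run import tensorflow as tf and then enable_memory_growth_on_all_gpus(tf) at the top of the script, before any tensors or models are created.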

Answer 9

For TensorFlow 2.0, use the following snippet:

import tensorflow as tf
gpu_devices = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(gpu_devices[0], True)

For earlier versions, the following snippet worked for me:

import tensorflow as tf
tf_config=tf.ConfigProto()
tf_config.gpu_options.allow_growth=True
sess = tf.Session(config=tf_config)

Answer 10

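I am new to TensorFlow. I have a GeForce 740M (or similar) GPU with 2 GB of RAM, and I was running an MNIST-style handwriting example for a native language, with 38,700 training images and 4,300 test images. I was trying to get precision, recall, and F1 with the following code, because sklearn was not giving me precise results. Once I added this to my existing code, I started getting GPU errors.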

TP = tf.count_nonzero(predicted * actual)
TN = tf.count_nonzero((predicted - 1) * (actual - 1))
FP = tf.count_nonzero(predicted * (actual - 1))
FN = tf.count_nonzero((predicted - 1) * actual)

prec = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * prec * recall / (prec + recall)

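On top of that, my model was heavy, I guess: I was getting a memory error after 147 or 148 epochs. Then I thought, why not create functions for these tasks? I do not know whether it works this way in TensorFlow, but I figured that a local variable might release its memory once it goes out of scope. After I defined the above elements for training and testing inside modules, I was able to reach 10,000 epochs without any issues. I hope this helps.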
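If the metrics only need to be computed on predictions that have already been pulled back to the host, one way to keep them off the GPU entirely is to compute them in plain Python. The sketch below mirrors the TP/FP/FN formulas above for binary 0/1 labels; the function name precision_recall_f1 and the choice to return 0.0 when a denominator is zero are assumptions, not something taken from the answer.

```python
def precision_recall_f1(predicted, actual):
    """Precision, recall and F1 for two equal-length sequences of 0/1 labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)  # true positives
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)  # false positives
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)  # false negatives
    # Assumption: report 0.0 instead of dividing by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Two true positives, one false positive, no false negatives.
prec, rec, f1 = precision_recall_f1([1, 0, 1, 1], [1, 0, 0, 1])
```

Because nothing here creates TensorFlow ops, calling it once per epoch does not add nodes to the graph or allocate GPU memory.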

Answer 11

This code worked for me:

import tensorflow as tf
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.InteractiveSession(config=config)

Answer 12

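I tried to train a U-Net on the VOC dataset, but ran out of memory because of the huge image size. I tried all of the tips above, and even tried a batch size of 1, but nothing improved. Sometimes the TensorFlow version itself causes memory issues. Try using

pip install tensorflow-gpu==1.8.0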

Answer 13

TensorFlow 2.0 Beta and (probably) beyond

The API changed again. It can now be found in:

tf.config.experimental.set_memory_growth(
    device,
    enable
)

Aliases:

tf.compat.v1.config.experimental.set_memory_growth
tf.compat.v2.config.experimental.set_memory_growth
tf.config.experimental.set_memory_growth

https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/config/experimental/set_memory_growth
https://www.tensorflow.org/beta/guide/using_gpu#limiting_gpu_memory_growth

Answer 14

# allocate 60% of GPU memory 
from keras.backend.tensorflow_backend import set_session
import tensorflow as tf 
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.6
set_session(tf.Session(config=config))

Answer 15

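All of the answers above refer either to setting the memory to a certain extent in TensorFlow 1.X or to allowing memory growth in TensorFlow 2.X.

The method tf.config.experimental.set_memory_growth does indeed allow dynamic growth during allocation/preprocessing. Nevertheless, one may want to allocate a specific amount of GPU memory from the start.

The reasoning behind allocating a specific amount of GPU memory is also to prevent out-of-memory (OOM) errors during training. For example, if one trains while opening Chrome tabs that consume video memory, tf.config.experimental.set_memory_growth(gpu, True) could result in OOM errors, hence the need in some cases to allocate more memory from the start.

The recommended and correct way to allot memory per GPU in TensorFlow 2.X is the following: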

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  # Restrict TensorFlow to only allocate 1GB of memory on the first GPU
  try:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)
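The memory_limit argument is given in megabytes, so the 4 GB asked about in the question corresponds to 4096. As a rough sketch, the same experimental calls can be wrapped in a helper that takes gigabytes and applies the cap to every visible GPU. The name limit_gpu_memory is made up for illustration, and the TensorFlow module is passed in as a parameter rather than imported inside the helper.

```python
def limit_gpu_memory(tf, gigabytes):
    """Cap TensorFlow's allocation on each visible GPU at `gigabytes` GB.

    `tf` is the imported tensorflow module. Returns the number of GPUs
    that were configured.
    """
    exp = tf.config.experimental
    limit_mb = int(gigabytes * 1024)  # memory_limit is expressed in MB
    gpus = exp.list_physical_devices("GPU")
    configured = 0
    for gpu in gpus:
        try:
            exp.set_virtual_device_configuration(
                gpu, [exp.VirtualDeviceConfiguration(memory_limit=limit_mb)]
            )
            configured += 1
        except RuntimeError as err:
            # Raised when the GPU has already been initialized.
            print(err)
    return configured
```

It would be called as limit_gpu_memory(tf, 4) right after import tensorflow as tf, before any operation touches the GPU.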

Answer 16

You can set

TF_FORCE_GPU_ALLOW_GROWTH=true

in your environment variables.
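If changing the shell environment is inconvenient, the variable can also be set from Python. This is only a sketch, and it assumes the assignment runs before TensorFlow is imported, since the allocator reads the variable when it is created.

```python
import os

# Assumption: this executes before the first `import tensorflow`.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

print(os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])  # true
```

Setting it in the shell (for example in a job script) has the advantage that no code change is needed in the training script itself.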

In the TensorFlow source code:

bool GPUBFCAllocator::GetAllowGrowthValue(const GPUOptions& gpu_options) {
  const char* force_allow_growth_string =
      std::getenv("TF_FORCE_GPU_ALLOW_GROWTH");
  if (force_allow_growth_string == nullptr) {
    return gpu_options.allow_growth();
  }
  // ... (remainder of the function is not shown)
}

How do I prevent TensorFlow from allocating the entire GPU memory?

16 answers:
