Question

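In my application, I reuse an existing MobileNet trained on ImageNet and retrain only the output layer on the flowers dataset, using just 5 classes. The retrained model is saved to disk. Later, the model is loaded and evaluated over several iterations, which eventually exhausts memory and crashes the whole application. After some diagnosis, I realized that the leak comes from the Keras model.evaluate() method. The problem can be reproduced with this standalone example: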

import os
import resource
import keras
import numpy as np

if __name__ == '__main__':
    init_alloc = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

    for it in range(4):
        # Fresh random validation batch: 64 RGB images of 224x224 with one-hot labels for 5 classes
        x_valid = np.random.uniform(0, 1, (64, 224, 224, 3)).astype(np.float32)
        y_valid = keras.utils.to_categorical(np.random.uniform(0, 5, (64, )).astype(np.int32), 5)

        # ru_maxrss is the peak resident set size so far (kilobytes on Linux)
        start_alloc = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

        # Reload the retrained model from disk on every iteration
        model = keras.models.load_model(os.path.abspath(os.path.join('.', 'mobilenet_flowers.h5')),
                                        custom_objects={'relu6': keras.applications.mobilenet.relu6,
                                                        'DepthwiseConv2D': keras.applications.mobilenet.DepthwiseConv2D})

        loss, _ = model.evaluate(x_valid, y_valid, batch_size=64, verbose=False)

        # Release the backend session and drop the reference to the model
        keras.backend.clear_session()
        del model

        end_alloc = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

        print('Iteration %d:' % it)
        print('  Memory alloc before evaluate() is %7d kilobytes'   % start_alloc)
        print('  Memory alloc after  evaluate() is %7d kilobytes'   % end_alloc)
        print('  Memory alloc loss for evaluate is %7d kilobytes\n' % (end_alloc - start_alloc))

    exit_alloc = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

    print('Memory alloc before loop is %7d kilobytes' % init_alloc)
    print('Memory alloc after  loop is %7d kilobytes' % exit_alloc)
    print('Memory alloc difference  is %7d kilobytes' % (exit_alloc - init_alloc))

Running the script prints the following:

Iteration 0:
  Memory alloc before evaluate() is  251864 kilobytes
  Memory alloc after  evaluate() is  901696 kilobytes
  Memory alloc loss for evaluate is  649832 kilobytes

Iteration 1:
  Memory alloc before evaluate() is  901696 kilobytes
  Memory alloc after  evaluate() is 1036780 kilobytes
  Memory alloc loss for evaluate is  135084 kilobytes

Iteration 2:
  Memory alloc before evaluate() is 1036780 kilobytes
  Memory alloc after  evaluate() is 1148692 kilobytes
  Memory alloc loss for evaluate is  111912 kilobytes

Iteration 3:
  Memory alloc before evaluate() is 1148692 kilobytes
  Memory alloc after  evaluate() is 1190804 kilobytes
  Memory alloc loss for evaluate is   42112 kilobytes

Memory alloc before loop is  138792 kilobytes
Memory alloc after  loop is 1190804 kilobytes
Memory alloc difference  is 1052012 kilobytes
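One caveat when reading these numbers: ru_maxrss reports the peak resident set size of the process, so it can never decrease, even if memory is actually released between iterations. As a cross-check, the sketch below reads the current resident set size from /proc/self/status instead. It assumes Linux, and the helper names peak_rss_kb and current_rss_kb are made up for illustration; it only changes how memory is measured and says nothing about whether evaluate() leaks.

```python
import resource


def peak_rss_kb():
    """Peak resident set size of this process (kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def current_rss_kb():
    """Current resident set size in kilobytes, read from /proc (Linux only)."""
    with open("/proc/self/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                # The line looks like "VmRSS:   123456 kB"
                return int(line.split()[1])
    raise RuntimeError("VmRSS not found in /proc/self/status")


if __name__ == "__main__":
    print("start:     current=%d kB, peak=%d kB" % (current_rss_kb(), peak_rss_kb()))
    block = b"x" * (100 * 1024 * 1024)  # allocate and fill roughly 100 MB
    print("allocated: current=%d kB, peak=%d kB" % (current_rss_kb(), peak_rss_kb()))
    del block
    print("released:  current=%d kB, peak=%d kB" % (current_rss_kb(), peak_rss_kb()))
```

Calling current_rss_kb() in place of the ru_maxrss reads inside the loop would show whether the resident size keeps growing after clear_session() or merely reached a high-water mark once.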

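Any suggestions as to what might be going wrong? After going through the forums, I tried adding K.clear_session() but, as you can see in the code, it did not help. The model is temporarily stored at https://ufile.io/rgaxs.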
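Since clear_session() did not help, one workaround that may be worth trying is to run each load-and-evaluate cycle in a short-lived child process, so that whatever it allocates is returned to the operating system when that process exits. This is a minimal sketch and not a confirmed fix for the leak: run_isolated and evaluate_once are hypothetical helper names, the "fork" start method assumes a Unix-like system, and keras is imported inside the worker so that the parent process never initializes the backend.

```python
import multiprocessing as mp
import os


def run_isolated(fn, *args):
    """Run fn(*args) in a one-off child process and return its result.

    Hypothetical helper: the child's memory is reclaimed by the OS once
    the single-worker pool is torn down, whatever fn allocated.
    """
    ctx = mp.get_context("fork")  # assumption: Unix-like OS such as Linux
    with ctx.Pool(processes=1) as pool:
        return pool.apply(fn, args)


def evaluate_once(model_path, x_valid, y_valid):
    """Hypothetical worker: load the saved model, evaluate it, return the loss."""
    import keras  # imported in the child only

    model = keras.models.load_model(
        model_path,
        custom_objects={
            "relu6": keras.applications.mobilenet.relu6,
            "DepthwiseConv2D": keras.applications.mobilenet.DepthwiseConv2D,
        },
    )
    loss, _ = model.evaluate(x_valid, y_valid, batch_size=64, verbose=False)
    return float(loss)


if __name__ == "__main__":
    # Keras-free smoke test: the work really happens in another process.
    print(run_isolated(sum, [1, 2, 3]))
    print(run_isolated(os.getpid) != os.getpid())
```

In the loop from the example, the load and evaluate steps would become loss = run_isolated(evaluate_once, model_path, x_valid, y_valid). The validation arrays are pickled and sent to the child on each call, which costs some time per iteration.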

Some additional information about my environment:

== cat /etc/issue ===============================================
Linux 4.10.0-38-generic #42~16.04.1-Ubuntu SMP Tue Oct 10 16:32:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
VERSION="16.04.3 LTS (Xenial Xerus)"
VERSION_ID="16.04"
VERSION_CODENAME=xenial

== are we in docker =============================================
No

== compiler =====================================================
c++ (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

== check pips ===================================================
numpy (1.12.1)
numpydoc (0.7.0)
protobuf (3.5.0)
tensorflow (1.4.0)
tensorflow-tensorboard (0.4.0rc3)

== check for virtualenv =========================================
False

== tensorflow import ============================================
tf.VERSION = 1.4.0
tf.GIT_VERSION = v1.4.0-rc1-11-g130a514
tf.COMPILER_VERSION = v1.4.0-rc1-11-g130a514
keras.VERSION = 2.0.9

Evaluating a retrained Keras model leaks memory when called in a loop

0 answers: