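When TensorFlow is already running and using the GPU, I cannot start a TensorBoard instance. The error is shown below. Apparently TensorFlow claims all GPU memory at startup, regardless of how much it actually needs. Is there a way to start TensorBoard while a TensorFlow process is running, or does TensorBoard always have to be started first?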
totalMemory: 5,93GiB freeMemory: 41,56MiB
2018-06-02 15:28:11.053634: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-02 15:28:11.321850: E tensorflow/core/common_runtime/direct_session.cc:154] Internal: CUDA runtime implicit initialization on GPU:0 failed. Status: out of memory
Traceback (most recent call last):
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/bin/tensorboard", line 11, in <module>
    sys.exit(run_main())
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/lib/python3.6/site-packages/tensorboard/main.py", line 36, in run_main
    tf.app.run(main)
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/lib/python3.6/site-packages/tensorboard/main.py", line 45, in main
    default.get_assets_zip_provider())
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/lib/python3.6/site-packages/tensorboard/program.py", line 166, in main
    tb = create_tb_app(plugins, assets_zip_provider)
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/lib/python3.6/site-packages/tensorboard/program.py", line 201, in create_tb_app
    flags=FLAGS)
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/lib/python3.6/site-packages/tensorboard/backend/application.py", line 126, in standard_tensorboard_wsgi
    plugin_instances = [constructor(context) for constructor in plugins]
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/lib/python3.6/site-packages/tensorboard/backend/application.py", line 126, in <listcomp>
    plugin_instances = [constructor(context) for constructor in plugins]
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/lib/python3.6/site-packages/tensorboard/plugins/beholder/beholder_plugin.py", line 47, in __init__
    self.most_recent_frame = im_util.get_image_relative_to_script('no-data.png')
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/lib/python3.6/site-packages/tensorboard/plugins/beholder/im_util.py", line 254, in get_image_relative_to_script
    return read_image(filename)
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/lib/python3.6/site-packages/tensorboard/plugins/beholder/im_util.py", line 242, in read_image
    return np.array(decode_png(image_file.read()))
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/lib/python3.6/site-packages/tensorboard/plugins/beholder/im_util.py", line 159, in __call__
    self._lazily_initialize()
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/lib/python3.6/site-packages/tensorboard/plugins/beholder/im_util.py", line 137, in _lazily_initialize
    self._session = tf.Session(graph=graph, config=config)
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1560, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/pascalwhoop/Documents/Code/University/powerTAC/python-agent/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 633, in __init__
    self._session = tf_session.TF_NewSession(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.
Answer 0 (score: 1)
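TensorBoard 1.7.0 appears to take up roughly 150 MB of GPU memory. See this open TensorBoard issue. It looks like it is being addressed.
An option in the interim is to limit the fraction of GPU memory that TensorFlow is allowed to pre-allocate per process, as detailed in this answer. Set it in the training script, so that a fixed share of GPU memory stays free for other tasks you may want to run on the GPU during training, such as TensorBoard.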
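import tensorflow as tf
# TF 1.x API: let this process pre-allocate at most 80% of the GPU's memory
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.8)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))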
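If you would rather derive the fraction than pick it by hand, the arithmetic can be sketched as below. This is only a rough illustration: the helper name `memory_fraction` is hypothetical, the total comes from the `totalMemory` line in the question's log, the 150 MB figure is the approximate TensorBoard usage mentioned above, and doubling it for headroom is my own assumption rather than a measured requirement.

```python
import math


def memory_fraction(total_mib, reserve_mib):
    """Return the share of GPU memory left after keeping reserve_mib free.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    if not 0 <= reserve_mib < total_mib:
        raise ValueError("reserve must be non-negative and smaller than total")
    return 1.0 - reserve_mib / total_mib


# totalMemory: 5,93GiB from the log in the question, converted to MiB.
total_mib = 5.93 * 1024
# ~150 MB reported for TensorBoard, doubled as a safety margin (assumption).
# MB and MiB are treated as interchangeable here; the difference is small.
reserve_mib = 2 * 150

# Round down to two decimals so the reserve is never smaller than intended.
fraction = math.floor(memory_fraction(total_mib, reserve_mib) * 100) / 100
print(fraction)
```

The resulting value would be passed as `per_process_gpu_memory_fraction` in the training script. The 0.8 used in the snippet above is more conservative and leaves more room for anything else that needs the GPU.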