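I'm trying to run the source code from this tutorial.
I tried it with their large image dataset as well as with my own small dataset of 52 images (46x46), but I keep getting a ResourceExhaustedError: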
ResourceExhaustedError OOM when allocating tensor with shape[1016064,1024]
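Is there a way to edit this code so that it trains on a smaller training set and avoids this error?
I tried changing the batch size in the code, but that did not help. I also made sure that no previous TensorFlow process was still running (I restarted my computer).
My label.txt contains these two lines: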
cat
dog
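and my train and validation folders each contain 2 subfolders with the same names, which hold the images.
I'm using: GeForce GTX 850M major: 5 minor: 0 memoryClockRate(GHz): 0.9015
totalMemory: 4.00GiB freeMemory: 3.35GiB
Just before the error, this is printed: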
Limit: 3235767910
InUse: 223232
MaxInUse: 223232
NumAllocs: 17
MaxAllocSize: 204800
Here is the full error:
2018-07-01 14:55:45.724585: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:279] *___________________________________________________________________________________________________
2018-07-01 14:55:45.725147: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\framework\op_kernel.cc:1202] OP_REQUIRES failed at random_op.cc:202 : Resource exhausted: OOM when allocating tensor with shape[1016064,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1361, in _do_call
return fn(*args)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1340, in _run_fn
target_list, status, run_metadata)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1016064,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: dense/kernel/Initializer/random_uniform/RandomUniform = RandomUniform[T=DT_INT32, _class=["loc:@dense/kernel"], dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](dense/kernel/Initializer/random_uniform/shape)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/Mac/Desktop/tensorflow/cnn_dog_vs_cat-master/cnn_dog_cat.py", line 175, in <module>
tf.app.run()
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
_sys.exit(main(argv))
File "C:/Users/Mac/Desktop/tensorflow/cnn_dog_vs_cat-master/cnn_dog_cat.py", line 167, in main
classifier.train(input_fn=lambda: train_input_fn(train_list), steps=10, hooks=[logging_hook])
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 352, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 888, in _train_model
log_step_count_steps=self._config.log_step_count_steps) as mon_sess:
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\monitored_session.py", line 384, in MonitoredTrainingSession
stop_grace_period_secs=stop_grace_period_secs)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\monitored_session.py", line 795, in __init__
stop_grace_period_secs=stop_grace_period_secs)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\monitored_session.py", line 518, in __init__
self._sess = _RecoverableSession(self._coordinated_creator)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\monitored_session.py", line 981, in __init__
_WrappedSession.__init__(self, self._create_session())
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\monitored_session.py", line 986, in _create_session
return self._sess_creator.create_session()
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\monitored_session.py", line 675, in create_session
self.tf_sess = self._session_creator.create_session()
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\monitored_session.py", line 446, in create_session
init_fn=self._scaffold.init_fn)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\session_manager.py", line 281, in prepare_session
sess.run(init_op, feed_dict=init_feed_dict)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 905, in run
run_metadata_ptr)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1137, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1355, in _do_run
options, run_metadata)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1374, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1016064,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: dense/kernel/Initializer/random_uniform/RandomUniform = RandomUniform[T=DT_INT32, _class=["loc:@dense/kernel"], dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](dense/kernel/Initializer/random_uniform/shape)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
Caused by op 'dense/kernel/Initializer/random_uniform/RandomUniform', defined at:
File "C:/Users/Mac/Desktop/tensorflow/cnn_dog_vs_cat-master/cnn_dog_cat.py", line 175, in <module>
tf.app.run()
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
_sys.exit(main(argv))
File "C:/Users/Mac/Desktop/tensorflow/cnn_dog_vs_cat-master/cnn_dog_cat.py", line 167, in main
classifier.train(input_fn=lambda: train_input_fn(train_list), steps=10, hooks=[logging_hook])
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 352, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 812, in _train_model
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 793, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "C:/Users/Mac/Desktop/tensorflow/cnn_dog_vs_cat-master/cnn_dog_cat.py", line 50, in cnn_model_fn
dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\layers\core.py", line 248, in dense
return layer.apply(inputs)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\layers\base.py", line 809, in apply
return self.__call__(inputs, *args, **kwargs)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\layers\base.py", line 680, in __call__
self.build(input_shapes)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\layers\core.py", line 134, in build
trainable=True)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\layers\base.py", line 533, in add_variable
partitioner=partitioner)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1297, in get_variable
constraint=constraint)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1093, in get_variable
constraint=constraint)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 439, in get_variable
constraint=constraint)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 408, in _true_getter
use_resource=use_resource, constraint=constraint)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 800, in _get_single_variable
use_resource=use_resource)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2157, in variable
use_resource=use_resource)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2147, in <lambda>
previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2130, in default_variable_creator
constraint=constraint)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variables.py", line 233, in __init__
constraint=constraint)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variables.py", line 327, in _init_from_args
initial_value(), name="initial_value", dtype=dtype)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 784, in <lambda>
shape.as_list(), dtype=dtype, partition_info=partition_info)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\init_ops.py", line 472, in __call__
shape, -limit, limit, dtype, seed=self.seed)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\random_ops.py", line 244, in random_uniform
shape, dtype, seed=seed1, seed2=seed2)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\gen_random_ops.py", line 473, in _random_uniform
name=name)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 3271, in create_op
op_def=op_def)
File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 1650, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1016064,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: dense/kernel/Initializer/random_uniform/RandomUniform = RandomUniform[T=DT_INT32, _class=["loc:@dense/kernel"], dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](dense/kernel/Initializer/random_uniform/shape)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
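Answer 0 (score: 2)
The OOM is caused by allocating the dense layer at line 50. The tensor that fails, shape [1016064, 1024], is that layer's weight matrix (dense/kernel), so its size depends on the architecture and not on the batch size or the number of training images: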
pool2_flat = tf.reshape(pool2, [-1, 126 * 126 * 64])
dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
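To see why changing the batch size makes no difference, here is a quick back-of-the-envelope check of how large that weight matrix is. It is plain Python and my own illustration; it uses only the numbers that appear in the question (the reshape, the 1024 units, and the allocator's "Limit" value) and assumes float32 weights, as the error message indicates ("type float").

```python
# Rough size of the dense layer's kernel, using the numbers from the question.
BYTES_PER_FLOAT32 = 4

flat_features = 126 * 126 * 64   # second dimension of the hard-coded reshape
units = 1024                     # units of the dense layer

kernel_elements = flat_features * units
kernel_bytes = kernel_elements * BYTES_PER_FLOAT32
gpu_limit_bytes = 3235767910     # "Limit" printed by the allocator before the error

print(flat_features)                   # 1016064, matches shape[1016064,1024]
print(round(kernel_bytes / 2**30, 2))  # kernel size in GiB
print(kernel_bytes > gpu_limit_bytes)  # True: the kernel alone exceeds the limit
```

The kernel by itself needs roughly 3.9 GiB, which is more than the roughly 3.0 GiB the allocator is allowed to use, and that is before counting gradients or optimizer state.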
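You can shrink this layer, for example by reducing the number of units, or by feeding it a smaller feature map (smaller input images or more downsampling before the dense layer).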
By the way, I strongly recommend against hard-coding shapes in tf.reshape. Consider using tf.layers.flatten instead, which is robust to changes in the architecture.
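As a rough sketch of how much the flattened size depends on the input resolution, the helper below computes it from the image size instead of hard-coding it. The layer layout is an assumption on my part, since the tutorial's model is not shown here: two 2x2, stride-2 poolings, 'same'-padded convolutions, and 64 filters in the last convolution. Under that assumption, 126 * 126 * 64 corresponds to 504x504 inputs. The function names are hypothetical.

```python
def flattened_size(height, width, filters, num_pools=2):
    """Features per example after `num_pools` 2x2, stride-2 poolings.

    Assumes the convolutions use 'same' padding (spatial size unchanged)
    and that each pooling halves height and width, rounding down.
    """
    for _ in range(num_pools):
        height //= 2
        width //= 2
    return height * width * filters


def dense_kernel_gib(flat_features, units, bytes_per_weight=4):
    """Size in GiB of a dense kernel of shape [flat_features, units]."""
    return flat_features * units * bytes_per_weight / 2**30


# Compare the assumed tutorial input size with the 46x46 images from the question.
for side in (504, 46):
    flat = flattened_size(side, side, filters=64)
    print(side, flat, round(dense_kernel_gib(flat, 1024), 3))
```

In the model function itself, the same effect comes from replacing the tf.reshape line with `pool2_flat = tf.layers.flatten(pool2)`, so the dense layer's input size follows whatever image size the input pipeline actually produces.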