I am running a modified version of the TensorFlow example https://github.com/tensorflow/models/tree/master/tutorials/image/cifar10_estimator and I am running out of memory.
The ResourceExhausted error says:
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
I tried adding it in the obvious places in main(), but I get variants of a protobuf error saying that the report_tensor_allocations_upon_oom run option was not found.
def main(job_dir, data_dir, num_gpus, variable_strategy,
         use_distortion_for_training, log_device_placement, num_intra_threads,
         **hparams):
  # The env variable is on deprecation path, default is set to off.
  os.environ['TF_SYNC_ON_FINISH'] = '0'
  os.environ['TF_ENABLE_WINOGRAD_NONFUSED'] = '1'

  # Session configuration.
  sess_config = tf.ConfigProto(
      allow_soft_placement=True,
      log_device_placement=log_device_placement,
      intra_op_parallelism_threads=num_intra_threads,
      report_tensor_allocations_upon_oom=True,  # Nope
      gpu_options=tf.GPUOptions(
          force_gpu_compatible=True,
          report_tensor_allocations_upon_oom=True))  # Nope

  config = cifar10_utils.RunConfig(
      session_config=sess_config, model_dir=job_dir,
      report_tensor_allocations_upon_oom=True)  # Nope
  tf.contrib.learn.learn_runner.run(
      get_experiment_fn(data_dir, num_gpus, variable_strategy,
                        use_distortion_for_training),
      run_config=config,
      hparams=tf.contrib.training.HParams(
          is_chief=config.is_chief,
          **hparams))
Where do I add report_tensor_allocations_upon_oom = True in this example?
Answer 0 (score: 2)
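report_tensor_allocations_upon_oom is a field of RunOptions, not of ConfigProto, GPUOptions or RunConfig, which is why none of those places accept it. You need to register a session run hook that passes the extra options to the session.run() calls the estimator makes.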
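import tensorflow as tf

class OomReportingHook(tf.train.SessionRunHook):
  def before_run(self, run_context):
    # Attach the OOM-reporting option to every session.run() call.
    return tf.train.SessionRunArgs(
        fetches=[],  # no extra fetches
        options=tf.RunOptions(report_tensor_allocations_upon_oom=True))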
Pass the hook in the hooks list to the relevant method of the estimator:
https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator