DeepLab TensorFlow: TypeError: MonitoredTrainingSession() got an unexpected keyword argument 'summary_dir'

Date: 2019-03-16 02:36:57

Tags: python tensorflow machine-learning

I am running this DeepLab example on a Deep Learning AMI (Ubuntu) EC2 instance, inside the provided conda environment tensorflow_p36.

From ~/models/research I run:

PATH_TO_INITIAL_CHECKPOINT=/home/ubuntu/models/research/deeplab/datasets/pascal_voc_seg/exp/train_on_trainval_set/train/model.ckpt-0.index

PATH_TO_TRAIN_DIR=/home/ubuntu/models/research/deeplab/datasets/pascal_voc_seg/exp/train_on_trainval_set/eval

PATH_TO_DATASET=/home/ubuntu/models/research/deeplab/datasets/pascal_voc_seg/VOCdevkit/VOC2012

python deeplab/train.py \
    --logtostderr \
    --training_number_of_steps=30000 \
    --train_split="train" \
    --model_variant="xception_65" \
    --atrous_rates=6 \
    --atrous_rates=12 \
    --atrous_rates=18 \
    --output_stride=16 \
    --decoder_output_stride=4 \
    --train_crop_size=513 \
    --train_crop_size=513 \
    --train_batch_size=1 \
    --dataset="pascal_voc_seg" \
    --tf_initial_checkpoint=${PATH_TO_INITIAL_CHECKPOINT} \
    --train_logdir=${PATH_TO_TRAIN_DIR} \
    --dataset_dir=${PATH_TO_DATASET}

The output is:

INFO:tensorflow:Training on train set
INFO:tensorflow:Ignoring initialization; other checkpoint exists
Traceback (most recent call last):
  File "deeplab/train.py", line 500, in <module>
    tf.app.run()
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "deeplab/train.py", line 492, in main
    hooks=[stop_hook]) as sess:
TypeError: MonitoredTrainingSession() got an unexpected keyword argument 'summary_dir'

If I comment out line 488 of ~/models/research/deeplab/train.py, which reads summary_dir=FLAGS.train_logdir, and run the command above again, the result is:

INFO:tensorflow:Training on train set
INFO:tensorflow:Ignoring initialization; other checkpoint exists
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
2019-03-16 02:34:49.871433: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
INFO:tensorflow:Restoring parameters from /home/ubuntu/models/research/deeplab/datasets/pascal_voc_seg/exp/train_on_trainval_set/eval/model.ckpt-0
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /home/ubuntu/models/research/deeplab/datasets/pascal_voc_seg/exp/train_on_trainval_set/eval/model.ckpt.

Any ideas?

1 Answer:

Answer 0 (score: 0)

I upgraded my TensorFlow version to 1.10.0. I also upgraded CUDA to version 9.0, but I don't know whether that is required.
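To see whether the active environment is older than that, a small check along these lines may help. It is only a sketch: the helper names parse_version and is_new_enough are made up for illustration, and the 1.10.0 threshold is simply the version reported as working in this answer, not a documented minimum. Versions are compared as integer tuples because a plain string comparison would rank "1.8.0" above "1.10.0".

```python
import re

# Assumption: 1.10.0 is the version this answer reports as working;
# it is not taken from the TensorFlow release notes.
MIN_VERSION = (1, 10, 0)


def parse_version(version):
    """Return the leading numeric part of a version string as a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    # A missing patch component (e.g. "2.0") is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def is_new_enough(version, minimum=MIN_VERSION):
    """Compare versions numerically, component by component."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        print(tf.__version__, "new enough:", is_new_enough(tf.__version__))
```

If this reports an older version, upgrading inside the conda environment (as described here) is the change that worked for me.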

Then I ran it with Python 3.

That solved my problem.

You can give it a try. I hope it helps.
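Before or after upgrading, you can also check directly whether the installed MonitoredTrainingSession accepts summary_dir, instead of commenting lines out of train.py. The following is a minimal sketch: accepts_keyword is a hypothetical helper built on the standard-library inspect module, and it assumes the function is reachable as tf.train.MonitoredTrainingSession, as it is in the traceback above.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func's signature allows the keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs parameter lets any keyword through the call itself.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


if __name__ == "__main__":
    try:
        import tensorflow as tf
        session_fn = tf.train.MonitoredTrainingSession
    except (ImportError, AttributeError):
        print("tf.train.MonitoredTrainingSession is not available here")
    else:
        print("accepts summary_dir:", accepts_keyword(session_fn, "summary_dir"))
```

If this prints False, the installed TensorFlow predates the argument that deeplab/train.py passes, which matches the TypeError in the question.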