I am running this DeepLab example on a Deep Learning AMI (Ubuntu) EC2 instance, inside the provided conda environment tensorflow_p36.
From ~/models/research I am running:
PATH_TO_INITIAL_CHECKPOINT=/home/ubuntu/models/research/deeplab/datasets/pascal_voc_seg/exp/train_on_trainval_set/train/model.ckpt-0.index
PATH_TO_TRAIN_DIR=/home/ubuntu/models/research/deeplab/datasets/pascal_voc_seg/exp/train_on_trainval_set/eval
PATH_TO_DATASET=/home/ubuntu/models/research/deeplab/datasets/pascal_voc_seg/VOCdevkit/VOC2012
python deeplab/train.py \
--logtostderr \
--training_number_of_steps=30000 \
--train_split="train" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size=513 \
--train_crop_size=513 \
--train_batch_size=1 \
--dataset="pascal_voc_seg" \
--tf_initial_checkpoint=${PATH_TO_INITIAL_CHECKPOINT} \
--train_logdir=${PATH_TO_TRAIN_DIR} \
--dataset_dir=${PATH_TO_DATASET}
The output is:
INFO:tensorflow:Training on train set
INFO:tensorflow:Ignoring initialization; other checkpoint exists
Traceback (most recent call last):
  File "deeplab/train.py", line 500, in <module>
    tf.app.run()
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "deeplab/train.py", line 492, in main
    hooks=[stop_hook]) as sess:
TypeError: MonitoredTrainingSession() got an unexpected keyword argument 'summary_dir'
If I comment out line 488 of ~/models/research/deeplab/train.py,
summary_dir=FLAGS.train_logdir,
and run the command above again, the result is:
INFO:tensorflow:Training on train set
INFO:tensorflow:Ignoring initialization; other checkpoint exists
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
2019-03-16 02:34:49.871433: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
INFO:tensorflow:Restoring parameters from /home/ubuntu/models/research/deeplab/datasets/pascal_voc_seg/exp/train_on_trainval_set/eval/model.ckpt-0
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /home/ubuntu/models/research/deeplab/datasets/pascal_voc_seg/exp/train_on_trainval_set/eval/model.ckpt.
Any ideas?
Answer 0 (score: 0)
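I upgraded my TensorFlow version to 1.10.0. I also upgraded CUDA to 9.0, but I don't know whether that is required.
Then I ran it with Python 3.
That solved my problem.
You can give it a try. I hope it helps.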
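The TypeError in the question suggests that the installed TensorFlow's MonitoredTrainingSession does not take the summary_dir argument that train.py passes. Below is a rough sketch for checking this in the active environment, before and after an upgrade. The helper name accepts_keyword is made up for illustration and is not part of TensorFlow. The fallback to tf.compat.v1 is an assumption for newer installs where tf.train no longer exposes this function.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs parameter accepts any keyword name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def find_monitored_training_session():
    """Locate MonitoredTrainingSession, or return None if it cannot be found."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    print("TensorFlow version:", tf.__version__)
    # Assumption: try tf.train first (1.x layout), then tf.compat.v1.train.
    for path in (("train",), ("compat", "v1", "train")):
        module = tf
        for part in path:
            module = getattr(module, part, None)
            if module is None:
                break
        func = getattr(module, "MonitoredTrainingSession", None) if module else None
        if func is not None:
            return func
    return None


if __name__ == "__main__":
    session_factory = find_monitored_training_session()
    if session_factory is None:
        print("Could not import TensorFlow or find MonitoredTrainingSession")
    else:
        print("accepts summary_dir:", accepts_keyword(session_factory, "summary_dir"))
```

If this prints False inside tensorflow_p36, the train.py checkout is probably newer than the installed TensorFlow, and upgrading as described above (or checking out an older revision of the models repository) is likely a safer fix than commenting out the argument.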