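I am using the tf (1.1.0 and 1.2.0) models/slim/scripts to run finetune_inception_v3 on my custom "logos" image dataset with 2 classes, previously converted to TFRecords.
Everything runs in a tensorflow container in Docker for Windows, with no GPU.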
docker run -it -v c:/tf_files:/tf_files gcr.io/tensorflow/tensorflow:1.1.0-devel (also 1.2.0-devel)
bazel-bin/inception/build_image_data \
--train_directory="${TRAIN_DIR}" \
--validation_directory="${VALIDATION_DIR}" \
--output_directory="${OUTPUT_DIRECTORY}" \
--labels_file="${LABELS_FILE}" \
--train_shards=2 \
--validation_shards=2 \
--num_threads=2
with LABELS_FILE=/tmp/data/labels.txt listing the 2 image classes:
ProperLogos
OtherLogos
/slimtf# git clone https://github.com/tensorflow/models.git
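I added logos.py to the datasets directory and updated dataset_factory.py to support the "logos" custom dataset.
I also changed dataset_utils.py, setting LABELS_FILENAME = 'labels2.txt',
because it expects the "label_id:name" format that I found in the code: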
0:ProperLogos
1:OtherLogos
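For reference, the conversion from the plain class list to the "label_id:name" layout shown above can be sketched as follows. This is only a minimal illustration: write_label_map and read_label_map are hypothetical helper names, not part of slim, and the ids are assumed to start at 0 in the order the classes are listed.

```python
import io


def write_label_map(class_names, path):
    """Write one "<label_id>:<class_name>" line per class, ids starting at 0."""
    with io.open(path, "w", encoding="utf-8") as f:
        for label_id, name in enumerate(class_names):
            f.write(u"%d:%s\n" % (label_id, name))


def read_label_map(path):
    """Parse "<label_id>:<class_name>" lines back into a dict of id -> name."""
    labels = {}
    with io.open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            label_id, name = line.split(":", 1)
            labels[int(label_id)] = name
    return labels
```

Calling write_label_map(["ProperLogos", "OtherLogos"], "labels2.txt") would produce the two lines listed above.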
/slimtf/models/slim# ./scripts/finetune_inception_v3_on_logos.sh
This launches the 1st command, train_image_classifier.py,
starting from /tmp/checkpoints/inception_v3.ckpt:
# Fine-tune only the new layers for 1000 steps.
python train_image_classifier.py \
--train_dir=${TRAIN_DIR} \
--dataset_name=logos \
--dataset_split_name=train \
--dataset_dir=${DATASET_DIR} \
--model_name=inception_v3 \
--checkpoint_path=${PRETRAINED_CHECKPOINT_DIR}/inception_v3.ckpt \
--checkpoint_exclude_scopes=InceptionV3/Logits,InceptionV3/AuxLogits \
--trainable_scopes=InceptionV3/Logits,InceptionV3/AuxLogits \
--max_number_of_steps=1000 \
--batch_size=12 \
--learning_rate=0.01 \
--learning_rate_decay_type=fixed \
--save_interval_secs=60 \
--save_summaries_secs=60 \
--log_every_n_steps=100 \
--clone_on_cpu=True \
--optimizer=rmsprop \
--weight_decay=0.00004
It fails with FailedPreconditionError: /tmp/logos/train
/slimtf/models/slim# ./scripts/finetune_inception_v3_on_logos.sh
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
INFO:tensorflow:Ignoring --checkpoint_path because a checkpoint already exists in /tmp/models-logos4a/inception_v3
2017-07-02 02:11:51.859377: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-02 02:11:51.859402: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-02 02:11:51.859408: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-07-02 02:11:51.859413: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-02 02:11:51.859418: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
INFO:tensorflow:Restoring parameters from /tmp/models-logos4a/inception_v3/model.ckpt-0
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path /tmp/models-logos4a/inception_v3/model.ckpt
INFO:tensorflow:Starting Queues.
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.FailedPreconditionError'>, /tmp/logos/train
[[Node: parallel_read/ReaderReadV2_1 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/cpu:0"](parallel_read/TFRecordReaderV2_1, parallel_read/filenames)]]
2017-07-02 02:11:55.725083: W tensorflow/core/kernels/queue_base.cc:303] _6_prefetch_queue/fifo_queue: Skipping cancelled dequeue attempt with queue not closed
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Caught OutOfRangeError. Stopping Training.
INFO:tensorflow:Finished training! Saving model to disk.
Traceback (most recent call last):
File "train_image_classifier.py", line 573, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "train_image_classifier.py", line 569, in main
sync_optimizer=optimizer if FLAGS.sync_replicas else None)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/slim/python/slim/learning.py", line 767, in train
sv.stop(threads, close_summary_writer=True)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 792, in stop
stop_grace_period_secs=self._stop_grace_secs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/queue_runner_impl.py", line 238, in _run
enqueue_callable()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1193, in _single_operation_run
target_list_as_strings, status, None)
File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.FailedPreconditionError: /tmp/logos/train
[[Node: parallel_read/ReaderReadV2_1 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/cpu:0"](parallel_read/TFRecordReaderV2_1, parallel_read/filenames)]]
Is anything missing from my train and validation input directories?
/tmp/logos/train
train-00000-of-00002 train-00001-of-00002
/tmp/logos/validation
validation-00000-of-00002 validation-00001-of-00002
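The path named in the error, /tmp/logos/train, is a directory rather than one of the shard files inside it, so one thing worth checking is what the filename pattern in logos.py actually resolves to. The sketch below is not part of slim. It assumes logos.py builds its pattern the way the bundled flowers.py does, by joining dataset_dir with a pattern formatted with the split name. describe_matches is a hypothetical helper, and the pattern strings in the usage note are examples only.

```python
import glob
import os


def describe_matches(dataset_dir, file_pattern, split_name):
    """Return (path, kind) for every path the resolved pattern matches.

    kind is "directory" or "file". A TFRecord reader needs files, so a
    "directory" entry would be a likely source of trouble.
    """
    # Assumption: same construction as slim's flowers.py.
    pattern = os.path.join(dataset_dir, file_pattern % split_name)
    results = []
    for path in sorted(glob.glob(pattern)):
        kind = "directory" if os.path.isdir(path) else "file"
        results.append((path, kind))
    return results
```

For example, describe_matches("/tmp/logos", "%s", "train") would report the train directory itself, while describe_matches("/tmp/logos", "%s/*", "train") would report the shard files inside it.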