Question

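I am training on Paperspace and have run into a problem I have not seen before. I have used the same machine previously without any issues. Training does not seem to start at all. I have reduced the batch size to 10 (the default is 24).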

Has anyone else had this problem?

This is the output I get when running train.py in models/research/object_detection. It has been going on like this for about an hour.

WARNING:tensorflow:From /home/paperspace/Documents/models/research/object_detection/trainer.py:210: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
2017-11-27 12:08:46.994554: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2
2017-11-27 12:08:47.109823: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-11-27 12:08:47.110204: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: 
name: Quadro P4000 major: 6 minor: 1 memoryClockRate(GHz): 1.48
pciBusID: 0000:00:05.0
totalMemory: 7.92GiB freeMemory: 7.60GiB
2017-11-27 12:08:47.110230: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Quadro P4000, pci bus id: 0000:00:05.0, compute capability: 6.1)
INFO:tensorflow:Restoring parameters from ssd_mobilenet_v1_coco_11_06_2017/model.ckpt
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
INFO:tensorflow:Starting Queues.
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Saving checkpoint to path training/model.ckpt

Answer 1

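I think you have not generated the TFRecord files. In the research folder, check generatetf.record and make sure the train and test record files are actually there and were produced correctly. If they were not generated, generate them, then delete everything from the training folder except the model (faster_rcnn) and label.pbtxt files, and start training again!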
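If it helps, here is a rough way to do that check without loading TensorFlow. It is only a sketch: it walks the standard TFRecord framing (8-byte little-endian length, 4-byte length CRC, payload, 4-byte payload CRC) and counts records, without verifying the CRCs. The paths `data/train.record` and `data/test.record` are assumptions for illustration; use whatever your pipeline config points at.

```python
import os
import struct


def count_tfrecords(path):
    """Count records in a TFRecord file by walking its length-prefixed framing.

    Each record is: uint64 length, uint32 CRC of the length, payload bytes,
    uint32 CRC of the payload. The CRCs are skipped here, not verified.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # 8-byte length + 4-byte length CRC
            if not header:
                return count  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header in %s" % path)
            (length,) = struct.unpack("<Q", header[:8])
            body = f.read(length + 4)  # payload + 4-byte payload CRC
            if len(body) < length + 4:
                raise ValueError("truncated record body in %s" % path)
            count += 1


def report(paths):
    """Print a one-line status for each candidate record file."""
    for path in paths:
        if not os.path.exists(path):
            print("%s: missing" % path)
        else:
            print("%s: %d records" % (path, count_tfrecords(path)))


if __name__ == "__main__":
    # Hypothetical locations; replace with the input paths from your config.
    report(["data/train.record", "data/test.record"])
```

A missing file, or one that reports 0 records, would be one plausible reason for global_step/sec staying at 0, since the input queue would never have anything to feed the trainer.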

TensorFlow Object Detection API training problem
