Question

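I have fine-tuned the faster_rcnn_resnet101 model available in the Model Zoo to detect my custom objects. I split the data into training and evaluation sets and used them in the config file during training. Now that training is complete, I want to test my model on unseen data (I call it the test data). I tried a few functions, but I cannot determine from TensorFlow's API which code to use to evaluate performance on the test dataset. Here is what I have tried:

I used object_detection/metrics/offline_eval_map_corloc.py to evaluate the test dataset. The code runs fine, but I get negative values for AR and AP for medium and large bounding boxes.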

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.459
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.601
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.543
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.459
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.543
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.627
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.628
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.628
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000

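Now, I know that mAP and AR cannot be negative, so something is wrong. Why do I see negative values when I run the offline evaluation on the test dataset?

The commands I used to run this pipeline are:

SPLIT=test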

echo "
label_map_path: '/training_demo/annotations/label_map.pbtxt'
tf_record_input_reader: { input_path: '/training_demo/Predictions/test.record' }
" > /training_demo/${SPLIT}_eval_metrics/${SPLIT}_input_config.pbtxt

echo "
metrics_set: 'coco_detection_metrics'
" > /training_demo/${SPLIT}_eval_metrics/${SPLIT}_eval_config.pbtxt 

python object_detection/metrics/offline_eval_map_corloc.py \
  --eval_dir='/training_demo/test_eval_metrics' \
  --eval_config_path='/training_demo/test_eval_metrics/test_eval_config.pbtxt' \
  --input_config_path='/training_demo/test_eval_metrics/test_input_config.pbtxt'

I also tried object_detection/legacy/eval.py, but the evaluation metrics it reports are negative as well:

DetectionBoxes_Recall/AR@100 (medium): -1.0
DetectionBoxes_Recall/AR@100 (small): -1.0
DetectionBoxes_Precision/mAP@.50IOU: -1.0
DetectionBoxes_Precision/mAP (medium): -1.0
etc.

I used the command:

python eval.py \
    --logtostderr \
    --checkpoint_dir=trained-inference-graphs/output_inference_graph/ \
    --eval_dir=test_eval_metrics \
    --pipeline_config_path=training/faster_rcnn_resnet101_coco-Copy1.config

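The eval_input_reader in faster_rcnn_resnet101_coco-Copy1.config points to the test TFRecord, which contains the ground truth and detection information.

I also tried object_detection/utils/object_detection_evaluation to get the evaluation. This is no different from the first approach, because it uses the same underlying function, evaluator.evaluate().

I would appreciate any help with this.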

Answer 1

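The evaluation metrics are in COCO format, so you can refer to the COCO API for the meaning of these values.

As specified in the COCO API code, -1 is the default value when a category is absent. In your case, all the detected objects belong only to the "small" area. Also, the "small", "medium", and "large" area categories depend on the area the object covers in pixels, as specified in the COCO API code.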
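To make the point above concrete, here is a minimal Python sketch of the two ideas: bucketing boxes by pixel area and falling back to -1 when a bucket has no valid entries. It assumes the default COCO thresholds of 32^2 and 96^2 square pixels. The helper names (area_bucket, summarize) and the sample box sizes are hypothetical, and the boundary handling is simplified, so treat it as an illustration rather than the COCO API's actual code.

```python
from collections import Counter

# Area thresholds in square pixels (assumed COCO defaults: 32^2 and 96^2).
SMALL_MAX = 32 ** 2    # 1024
MEDIUM_MAX = 96 ** 2   # 9216


def area_bucket(width, height):
    """Return the size bucket for a box given its width and height in pixels.

    Simplification: boundaries are treated as half-open in this sketch.
    """
    area = width * height
    if area < SMALL_MAX:
        return "small"
    if area < MEDIUM_MAX:
        return "medium"
    return "large"


def summarize(values):
    """Average the valid entries; return -1.0 when there are none.

    An entry of -1 stands for "nothing to evaluate in this slice".
    """
    valid = [v for v in values if v > -1]
    return sum(valid) / len(valid) if valid else -1.0


# Hypothetical ground-truth box sizes (width, height); not taken from the question.
boxes = [(20, 25), (12, 30), (28, 28)]
counts = Counter(area_bucket(w, h) for w, h in boxes)
print(counts)                    # every sample box falls in the "small" bucket
print(summarize([-1.0, -1.0]))   # no valid entries, so the summary stays at -1.0
```

Running a count like this over the boxes in the test set is a quick way to check whether the medium and large buckets are simply empty, which would explain the -1.000 entries without anything being wrong in the evaluation itself.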

Answer 2

In my case, I just run model_main.py once, after changing eval_input_reader in pipeline.config to point to the test dataset. I am not sure whether this is the right way to do it, though.

python model_main.py \
    --alsologtostderr \
    --run_once \
    --checkpoint_dir=$path_to_model \
    --model_dir=$path_to_eval \
    --pipeline_config_path=$path_to_config

pipeline.config

eval_config: {
  metrics_set: "coco_detection_metrics"
  num_examples: 721 # no of test images
  num_visualizations: 10 # no of visualizations for tensorboard
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "/path/to/test-data.record"
  }
  label_map_path: "/path/to/label_map.pbtxt"
  shuffle: true
  num_readers: 1
}

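For me, there was also no difference in mAP between the validation and test datasets, so I am not sure whether a split into training, validation, and test data is really needed.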

Answer 3

!python eval.py --logtostderr --pipeline_config_path= --checkpoint_dir= --eval_dir=eval/

You can find eval.py in the legacy folder.

TensorFlow object detection model evaluation on a test dataset
