I have trained a TensorFlow object detection model and want to obtain the bounding-box coordinates of the detected objects in order to compute metrics such as mAP and Recall. To get the bounding-box coordinates I am using the following code.
For this example, I am currently using a single test image that contains 3 instances of cars.
for image_path in TEST_IMAGES_PATH:
    image = Image.open(image_path)
    # The array-based representation of the image will be used later in order to
    # prepare the result image with boxes and labels on it.
    image_np = load_image_into_numpy_array(image)
    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Actual detection.
    output_dict = run_inference_for_single_image(image_np, detection_graph)
    # Visualization of the results of a detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        line_thickness=2)
    plt.figure(figsize=IMAGE_SIZE)
    plt.imshow(image_np)
    width = 1024
    height = 600
    # detection_boxes are [ymin, xmin, ymax, xmax] in normalized coordinates,
    # so y values are scaled by the image height and x values by the width.
    for box, score in zip(output_dict['detection_boxes'], output_dict['detection_scores']):
        if score > 0.5:
            print(box[0] * height, box[1] * width, box[2] * height, box[3] * width)
However, after inspecting the ground-truth labels, I found that the order in which the predictions are emitted does not match the order of the ground-truth labels, as shown below:
510.1 354.7 633.5 423.2(car X)
475.3 399.9 743.9 568.8(car Y)
491.3 332.3 36.2 58.6(car Z)
Because of this, I am not sure how to use the predicted bounding boxes to compute metrics such as mAP and recall. Is there a reliable way to determine which ground-truth label corresponds to each object detected in the image?
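For reference, detectors generally do not emit boxes in ground-truth order, so evaluation is done by associating each detection with a ground-truth box via IoU (intersection over union) rather than by list position. A common scheme, and roughly what mAP implementations do internally, is greedy matching: walk the detections in descending score order and assign each one to the still-unmatched ground-truth box with the highest IoU above a threshold. A minimal sketch, assuming boxes are in pixel coordinates as [ymin, xmin, ymax, xmax] (the helper names here are illustrative, not part of the TF Object Detection API):

```python
def iou(a, b):
    """Intersection-over-union of two [ymin, xmin, ymax, xmax] boxes."""
    ymin, xmin = max(a[0], b[0]), max(a[1], b[1])
    ymax, xmax = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def match_detections(det_boxes, det_scores, gt_boxes, iou_thresh=0.5):
    """Greedy IoU matching: returns {detection_index: gt_index}.

    Detections are visited in descending score order; each one claims the
    unmatched ground-truth box with the highest IoU >= iou_thresh.
    Unmatched detections count as false positives, unmatched ground-truth
    boxes as false negatives when computing precision/recall.
    """
    order = sorted(range(len(det_boxes)), key=lambda i: -det_scores[i])
    matched_gt = set()
    matches = {}
    for i in order:
        best_j, best_iou = None, iou_thresh
        for j, gt in enumerate(gt_boxes):
            if j in matched_gt:
                continue
            v = iou(det_boxes[i], gt)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j is not None:
            matches[i] = best_j
            matched_gt.add(best_j)
    return matches
```

With the mapping in hand, each detection can be labeled true/false positive at the chosen IoU threshold, which is all that is needed to accumulate precision, recall, and ultimately mAP. Note also that the printed coordinates must use a consistent (ymin, xmin, ymax, xmax) order and the correct height/width scaling, otherwise the IoUs against the ground truth will be meaningless.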