Question

I am currently doing some research with the TensorFlow Object Detection API. To get started, I followed this tutorial:

https://www.oreilly.com/ideas/object-detection-with-tensorflow

The tutorial explains how to generate tfrecords from images and PASCAL VOC XML label files, and how to get started with the Object Detection API.

To generate the tfrecords, I modified some code from the raccoon repository referenced on GitHub:

https://github.com/datitran/raccoon_dataset

I labeled my images with LabelImg (https://github.com/tzutalin/labelImg), which can save annotations in PASCAL VOC format.

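Following the tutorial, I ran a first test (training) with 60 images, and after one hour (574 steps) I interrupted it to save a checkpoint. After that I did the "export graph for inference.py" step and saved the frozen model. (Please correct me if I am saying something stupid; this is all new to me...)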

Then I adapted the Jupyter notebook from the tutorial to my needs and, tada, there were some detections in my test images.

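So far so good, but now I want to see how good the object detection is (its accuracy). For that I want to add some ground-truth boxes from my PASCAL VOC test dataset, but I am having trouble getting there.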

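The first thing I tried was to manually draw the boxes I read from my VOC dataset onto the image, using matplotlib.patches.Rectangle (https://matplotlib.org/devdocs/api/_as_gen/matplotlib.patches.Rectangle.html).

But with my solution the boxes end up in a different plot/figure than the image...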

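So I thought the Object Detection API might provide some functions to add ground-truth boxes and to evaluate the detection accuracy against my VOC test dataset.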

So I had a look at https://github.com/tensorflow/models/tree/master/research/object_detection/utils and I think I found a function (def draw_bounding_box_on_image_array) that draws boxes on my image_np, but nothing happened. This is what the API uses for its own visualization:

vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      np.squeeze(boxes),
      np.squeeze(classes).astype(np.int32),
      np.squeeze(scores),
      category_index,
      use_normalized_coordinates=True,
      line_thickness=2)

And this is what I tried to use:

vis_util.draw_bounding_box_on_image(
        image_np,
        bndbox_coordinates[0][1],
        bndbox_coordinates[0][0],
        bndbox_coordinates[0][3],
        bndbox_coordinates[0][2])

But when I plot this numpy array image, there is no box.

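Am I missing something? Question 2: which classes in the API perform the accuracy evaluation? I can't see it with my tired eyes... Does that class/function use PASCAL VOC for this? Maybe I can use this: https://github.com/tensorflow/models/blob/master/research/object_detection/utils/object_detection_evaluation.py but I am not confident, because I am also new to Python and have a hard time understanding some of the code/comments...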
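As a rough illustration of what a PASCAL VOC style evaluation computes, detections can be matched to ground-truth boxes by overlap and counted as true or false positives. This is a hand-written sketch for one image and one class, not the Object Detection API's own evaluator. The names `iou`, `match_detections` and `precision_recall`, the default threshold of 0.5, and the sample boxes are all assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two [ymin, xmin, ymax, xmax] boxes."""
    inter_h = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_w = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_h * inter_w
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_detections(detections, ground_truths, iou_threshold=0.5):
    """Count true positives, false positives and false negatives.

    detections: list of (score, box) tuples; ground_truths: list of boxes.
    Detections are visited in order of descending score. Each one is
    compared with its best-overlapping ground-truth box; a second
    detection of an already matched box counts as a false positive.
    """
    matched = set()
    tp = fp = 0
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truths):
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold and best_idx not in matched:
            matched.add(best_idx)
            tp += 1
        else:
            fp += 1
    fn = len(ground_truths) - len(matched)
    return tp, fp, fn


def precision_recall(tp, fp, fn):
    """Precision and recall, returning 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# Hypothetical data: one ground-truth box and three detections.
ground_truths = [[90, 42, 125, 87]]
detections = [(0.9, [95, 50, 130, 90]),
              (0.8, [92, 44, 126, 88]),
              (0.3, [0, 0, 10, 10])]
tp, fp, fn = match_detections(detections, ground_truths)
print(tp, fp, fn, precision_recall(tp, fp, fn))
```

Averaging precision over recall levels and over classes (mAP) builds on these counts; the sketch stops at the per-image counts to keep the idea visible.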

Maybe the professionals out there can help me.

Thanks in advance

Edit

I read a bit of this article: https://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/

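Now I know that I need IoU (Intersection over Union), so does anyone know whether the Object Detection API provides a function for this? I will take another look at the API...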
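For reference, IoU for two boxes can be sketched in a few lines of plain Python. The sketch below assumes boxes in [ymin, xmin, ymax, xmax] order and treats coordinates as continuous (some implementations add 1 to widths and heights for pixel-inclusive boxes). The function name `iou` and the sample prediction box are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two [ymin, xmin, ymax, xmax] boxes."""
    # Corners of the overlapping rectangle (if any).
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


ground_truth = [90, 42, 125, 87]   # box from the VOC annotation
prediction = [95, 50, 130, 90]     # hypothetical detector output
print(iou(ground_truth, prediction))
```

A value of 1.0 means the boxes coincide and 0.0 means they do not overlap; PASCAL VOC style evaluation commonly treats a detection as correct at an IoU of 0.5 or more.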

Answer 1

I think you are not passing the complete set of arguments:

vis_util.visualize_boxes_and_labels_on_image_array(
  image_np,
  np.squeeze(boxes),
  np.squeeze(classes).astype(np.int32),
  np.squeeze(scores),
  category_index,
  use_normalized_coordinates=True,
  line_thickness=2)

You need to pass image_np = the image, np.squeeze(boxes) = the bounding box coordinates, np.squeeze(classes).astype(np.int32) = the class each object belongs to, and np.squeeze(scores) = the confidence score, which will always be 1 for ground-truth boxes.
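One way to prepare ground-truth boxes in the shape this call expects is sketched below. It assumes the boxes come as pixel coordinates in [ymin, xmin, ymax, xmax] order and that the image size is known; the helper name `ground_truth_arrays`, the image size 250 x 200, and the class id 1 are made up for illustration. The actual `vis_util` call is left as a comment because it needs the Object Detection API and a `category_index`.

```python
import numpy as np


def ground_truth_arrays(pixel_boxes, class_ids, image_height, image_width):
    """Build boxes, classes and scores arrays for ground-truth boxes.

    pixel_boxes: list of [ymin, xmin, ymax, xmax] in pixels.
    Returns normalized boxes (N, 4), int32 classes (N,), float32 scores (N,).
    """
    boxes = np.asarray(pixel_boxes, dtype=np.float32).reshape(-1, 4)
    scale = np.array([image_height, image_width, image_height, image_width],
                     dtype=np.float32)
    boxes = boxes / scale                      # normalize to the range [0, 1]
    classes = np.asarray(class_ids, dtype=np.int32)
    scores = np.ones(len(classes), dtype=np.float32)  # ground truth: score 1
    return boxes, classes, scores


# Hypothetical example: one box in a 250 (height) x 200 (width) image.
gt_boxes, gt_classes, gt_scores = ground_truth_arrays(
    [[90, 42, 125, 87]], [1], image_height=250, image_width=200)
print(gt_boxes, gt_classes, gt_scores)

# With the Object Detection API available, these could then be passed on:
# vis_util.visualize_boxes_and_labels_on_image_array(
#     image_np, gt_boxes, gt_classes, gt_scores, category_index,
#     use_normalized_coordinates=True, line_thickness=2)
```

Setting every score to 1 keeps the ground-truth boxes above the function's minimum score threshold, so none of them are filtered out before drawing.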

Answer 2

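If you just want to use some plain old Python code, you can take advantage of some useful functions in the utils folder of the TensorFlow Object Detection API: [https://github.com/tensorflow/models/blob/master/research/object_detection/utils/visualization_utils.py]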

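You can then overlay the ground-truth boxes and the predicted boxes on the original image using the [y_min, x_min, y_max, x_max] coordinates of both boxes. Take a further look at object_detection_tutorial.ipynb to check out the load_image_into_numpy_array function. For example, to display a ground-truth box with coordinates [90, 42, 125, 87], you can do the following: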

    from PIL import Image
    from utils import visualization_utils as vis_util
    from matplotlib import pyplot as plt
    import numpy as np

    # IMAGE_SIZE and image_path are assumed to be defined earlier,
    # as in object_detection_tutorial.ipynb.

    def load_image_into_numpy_array(image):  # adapted from object_detection_tutorial.ipynb
        (im_width, im_height) = image.size
        return np.array(image.getdata()).reshape(
            (im_height, im_width, 3)).astype(np.uint8)


    image = Image.open(image_path)
    image_np = load_image_into_numpy_array(image)  # numpy array with shape [height, width, 3]

    # Ground-truth box as stored in the PASCAL VOC XML file:
    #     <xmin>42</xmin>
    #     <ymin>90</ymin>
    #     <xmax>87</xmax>
    #     <ymax>125</ymax>

    # Arguments are ymin, xmin, ymax, xmax in pixels, hence
    # use_normalized_coordinates=False. The box is drawn onto image_np in place.
    vis_util.draw_bounding_box_on_image_array(image_np, 90, 42, 125, 87, color='red', thickness=2, use_normalized_coordinates=False)
    plt.figure(figsize=IMAGE_SIZE)
    plt.imshow(image_np)
    plt.show()
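Instead of typing the coordinates by hand, the boxes could be read from the PASCAL VOC XML file and reordered to [ymin, xmin, ymax, xmax]. The following is a minimal sketch using only the standard library; the helper name `read_voc_boxes`, the sample annotation, and its image size of 200 x 250 are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_text, normalize=False):
    """Return a list of (class_name, [ymin, xmin, ymax, xmax]) tuples.

    With normalize=True the coordinates are divided by the image size
    from the <size> element, giving values in the range [0, 1].
    """
    root = ET.fromstring(xml_text)
    width = float(root.find("size/width").text)
    height = float(root.find("size/height").text)

    boxes = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        bnd = obj.find("bndbox")
        xmin = float(bnd.find("xmin").text)
        ymin = float(bnd.find("ymin").text)
        xmax = float(bnd.find("xmax").text)
        ymax = float(bnd.find("ymax").text)
        box = [ymin, xmin, ymax, xmax]  # VOC stores x first; reorder to y first
        if normalize:
            box = [ymin / height, xmin / width, ymax / height, xmax / width]
        boxes.append((name, box))
    return boxes


# Hypothetical annotation containing the box from the example above.
SAMPLE_XML = """<annotation>
  <size><width>200</width><height>250</height><depth>3</depth></size>
  <object>
    <name>raccoon</name>
    <bndbox><xmin>42</xmin><ymin>90</ymin><xmax>87</xmax><ymax>125</ymax></bndbox>
  </object>
</annotation>"""

for name, box in read_voc_boxes(SAMPLE_XML):
    print(name, box)
```

For a file on disk, `ET.parse(path).getroot()` could replace `ET.fromstring`. Each returned box can be passed to draw_bounding_box_on_image_array as `*box`, with use_normalized_coordinates matching the `normalize` flag.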

TF - Object detection with ground-truth boxes
