Question

我已经使用Tensorflow对象检测API来检测图像中的手。通过使用提供的示例代码（object_detection_tutorial.ipynb），我已经能够在图像上绘制边框。有什么方法可以只选择检测到的区域（位于边界框内）并将其作为图像获取？

例如，

样本输入图像

tensorflow输出

我想要什么

对象检测API示例代码可在此处找到。 https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb

任何帮助将不胜感激！

Answer 1

是的，在本教程中，变量output_dict可用于实现该目标。请注意，所有传递给函数vis_util.visualize_boxes_and_labels_on_image_array的变量都包含框，分数等。

首先，您需要获取图像形状，因为框坐标为标准化形式。

img_height, img_width, img_channel = image_np.shape

然后将所有方框坐标转换为绝对格式

absolute_coord = []
THRESHOLD = 0.7 # adjust your threshold here
N = len(output_dict['detection_boxes'])
for i in range(N):
    if output_dict['score'][i] < THRESHOLD:
        continue
    box = output_dict['detection_boxes']
    ymin, xmin, ymax, xmax = box
    x_up = int(xmin*img_width)
    y_up = int(ymin*img_height)
    x_down = int(xmax*img_width)
    y_down = int(ymax*img_height)
    absolute_coord.append((x_up,y_up,x_down,y_down))

然后，您可以使用numpy slices获取边界框中的图像区域

bounding_box_img = []
for c in absolute_coord:
    bounding_box_img.append(image_np[c[1]:c[3], c[0]:c[2],:])

然后只需将bounding_box_img中的所有numpy数组另存为图像。保存时，可能需要更改形状，因为img的形状为[img_height，img_width，img_channel]。如果使用分数数组，您甚至还可以过滤掉所有置信度较低的检测结果。

PS：我可能已经把img_height和img_width弄混了，但是这些应该给您一个起点。

在Python中从图像中裁剪并仅选择检测到的区域

样本输入图像

tensorflow输出

我想要什么

1 个答案: