Question

This is my code for visualizing the bounding boxes on the image:

viz_utils.visualize_boxes_and_labels_on_image_array(
    image_np_with_detections,
    detections['detection_boxes'][0].numpy(),
    (detections['detection_classes'][0].numpy() + label_id_offset).astype(int),
    detections['detection_scores'][0].numpy(),
    category_index,
    use_normalized_coordinates=True,
    max_boxes_to_draw=200,
    min_score_thresh=.5,
    agnostic_mode=False,
)

And this is how I crop the bounding box after detection:

width=600
height=900

boxes = detections['detection_boxes']
ymin = int((boxes[0][0][0]*height))
xmin = int((boxes[0][0][1]*width))
ymax = int((boxes[0][0][2]*height))
xmax = int((boxes[0][0][3]*width))
print ("xmin: {} ".format(xmin),"ymin: {}".format(ymin),"xmax: {}".format(xmax),"ymax: {}".format(ymax))

from PIL import Image
img = Image.open(image_path)
img2 = img.crop((xmin,xmax,ymin,ymax))
img2.save("/content/gdrive/MyDrive/UrduDetection/Croped_images/img8.jpg")

This produces an incorrectly localized crop.

How can I get a correctly cropped image of the detected bounding box?

Answer 1

PIL's crop() function does not accept the arguments in the order you passed them. It expects a box of the form (left, top, right, bottom), which in your case is:

img2 = img.crop((xmin,ymin,xmax,ymax))

Alternatively, you can open the image as a numpy array and crop it by indexing:

import numpy
import PIL.Image
img  = numpy.asarray(PIL.Image.open('test.jpg'))
img2 = img[ymin:ymax, xmin:xmax, ...]
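
As a rough sketch of both approaches side by side: this assumes the boxes are normalized and ordered [ymin, xmin, ymax, xmax], which matches how the question indexes `detection_boxes`. The helper name `to_pixel_box`, the synthetic image, and the box values are all made up for illustration. The image size is read from the image itself instead of being hard-coded, so a mismatch between the assumed and the real width/height cannot skew the crop.

```python
import numpy as np
from PIL import Image


def to_pixel_box(box, img_width, img_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box (assumed layout)
    to PIL's (left, top, right, bottom) order in pixels."""
    ymin, xmin, ymax, xmax = (float(v) for v in box)
    return (
        int(xmin * img_width),
        int(ymin * img_height),
        int(xmax * img_width),
        int(ymax * img_height),
    )


# Synthetic stand-in image so the sketch runs without a file on disk.
img = Image.new("RGB", (600, 900), "white")
box = [0.25, 0.125, 0.5, 0.625]  # made-up normalized box

# PIL reports size as (width, height).
left, top, right, bottom = to_pixel_box(box, *img.size)

# PIL crop: (left, top, right, bottom).
pil_crop = img.crop((left, top, right, bottom))

# numpy crop: rows (y) come first, then columns (x).
np_crop = np.asarray(img)[top:bottom, left:right, ...]
```

Both crops cover the same region; the only difference is that PIL takes x before y while numpy indexing takes rows (y) before columns (x).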

Edit

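I don't know what is inside your visualization function, so I can't say how it works or why it fails. But just from the axis values on your plot, I can tell that xmin-xmax should be roughly 200-620 rather than 143-420, and ymin-ymax should be somewhere around 510-560 rather than the 779-856 you are using. You are passing a y range that does not exist in the image, hence the black output.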

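You are probably converting the coordinates incorrectly outside the visualization function. Your coordinates may be in xc, yc, w, h form while you are treating them as xmin, xmax, ymin, ymax.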
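
If the coordinates really are in center form, which is only a guess here, the conversion to corners might look like the sketch below. The function name `center_to_corners` is hypothetical, and the input values are made up to land on the rough ranges read off the plot; they are not the asker's actual detections.

```python
def center_to_corners(xc, yc, w, h):
    """Convert a (center x, center y, width, height) box
    to (xmin, ymin, xmax, ymax)."""
    half_w, half_h = w / 2, h / 2
    return (xc - half_w, yc - half_h, xc + half_w, yc + half_h)


# Made-up pixel values, chosen only to illustrate the arithmetic.
corners = center_to_corners(410, 535, 420, 50)
print(corners)  # (200.0, 510.0, 620.0, 560.0)
```

The result can then be passed to crop() in (xmin, ymin, xmax, ymax) order after rounding to integers.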
