Question

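I'm new to AI, and I'm using the TensorFlow Object Detection API to detect products in images. Detection already works, but I want to get the Xmax, Xmin, Ymax and Ymin coordinates of each object in the image.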

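Here is an image with the detected objects; in this case, 2 objects were detected.

Image:

As we can see, I do get the objects' coordinates, but not clearly: the output contains more than 3 sets of coordinates, and I only want as many sets of coordinates as there are objects in the image.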

This is the code that produces the output:

with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')

        print(detection_graph.get_tensor_by_name('detection_boxes:0'))

        for image_path in TEST_IMAGE_PATHS:
            boxes = detect_objects(image_path)
            print(boxes)

Output:

Tensor("detection_boxes:0", dtype=float32)
[[[0.16593058 0.06630109 0.8009524  0.5019088 ]
  [0.15757088 0.5376015  0.8869156  0.9394863 ]
  [0.5966009  0.88420665 0.6564093  0.9339011 ]
  ...
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]
  [0.         0.         0.         0.        ]]]

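I'd like to get something like this, containing only the bounding box coordinates. Let's assume these are the coordinates of the objects.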

[0.16593058 0.06630109 0.8009524  0.5019088 ]
[0.15757088 0.5376015  0.8869156  0.9394863 ]

Answer 1

You should be aware of two things:

These are all the coordinates of all (usually 100) top detections.
These are given in normalized coordinates.

Therefore, to filter the detections, use detection_scores to decide which indices to drop (the detections are sorted by score), and multiply the normalized coordinates by the original image size to get absolute coordinates. The normalized coordinates are given as [ymin, xmin, ymax, xmax], so multiply the first and third coordinates by y_size and the second and fourth by x_size. You can compute x_size and y_size by evaluating the shape of image_tensor.
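The filtering and scaling described above can be sketched with NumPy alone. This is only an illustration under stated assumptions, not a helper from the Object Detection API: the function name boxes_to_pixels, the 0.5 score threshold, the sample scores and the 480x640 image size are all made up here; only the box values come from the question's output.

```python
import numpy as np

def boxes_to_pixels(boxes, scores, image_height, image_width, min_score=0.5):
    """Keep detections scoring at least min_score and scale them to pixels.

    boxes:  array of shape (N, 4), normalized [ymin, xmin, ymax, xmax]
    scores: array of shape (N,)
    Returns an int array of shape (K, 4), still in [ymin, xmin, ymax, xmax] order.
    """
    boxes = np.asarray(boxes, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)
    keep = scores >= min_score  # boolean mask over the N detections
    # y values scale with the image height, x values with the image width.
    scale = np.array([image_height, image_width, image_height, image_width],
                     dtype=np.float32)
    return (boxes[keep] * scale).astype(int)

# Box values copied from the question; the scores are hypothetical.
boxes = np.array([
    [0.16593058, 0.06630109, 0.8009524, 0.5019088],
    [0.15757088, 0.5376015, 0.8869156, 0.9394863],
    [0.5966009, 0.88420665, 0.6564093, 0.9339011],
    [0.0, 0.0, 0.0, 0.0],
])
scores = np.array([0.98, 0.95, 0.12, 0.0])

for ymin, xmin, ymax, xmax in boxes_to_pixels(boxes, scores,
                                              image_height=480, image_width=640):
    print(f"xmin={xmin} xmax={xmax} ymin={ymin} ymax={ymax}")
```

The arrays returned for a single image, like the boxes printed in the question, carry a leading batch dimension, so you would pass boxes[0] and the matching scores[0] rather than the full arrays. The right threshold depends on the model and is worth tuning.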

Answer 2

Code:

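bboxes = []
for box in boxes[0]:
    # Each box is normalized [ymin, xmin, ymax, xmax]; assumes a 640x480 (width x height) image.
    ymin, xmin, ymax, xmax = box
    # Store pixel [x, y, width, height].
    bboxes.append([int(xmin * 640), int(ymin * 480), int((xmax - xmin) * 640), int((ymax - ymin) * 480)])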
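The snippet above hard-codes a 640x480 image and also converts the all-zero padding rows that follow the real detections. A variant that takes the image size as parameters and skips zero-area rows might look like the sketch below; to_xywh is a hypothetical helper name, and the sample values are taken from the question's output.

```python
import numpy as np

def to_xywh(boxes, image_width, image_height):
    """Convert normalized [ymin, xmin, ymax, xmax] rows to pixel [x, y, width, height]."""
    result = []
    for ymin, xmin, ymax, xmax in np.asarray(boxes, dtype=float):
        if ymax <= ymin or xmax <= xmin:
            continue  # zero-area row, such as the all-zero padding
        result.append([
            int(xmin * image_width),
            int(ymin * image_height),
            int((xmax - xmin) * image_width),
            int((ymax - ymin) * image_height),
        ])
    return result

sample = [
    [0.16593058, 0.06630109, 0.8009524, 0.5019088],
    [0.0, 0.0, 0.0, 0.0],
]
print(to_xywh(sample, image_width=640, image_height=480))
```

Skipping zero-area rows only removes the padding; low-confidence detections still need to be filtered by score.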

How to get bounding boxes [Xmax, Xmin, Ymax, Ymin] from TensorFlow object detection
