Question

How do I crop an image to a bounding box in TensorFlow? I am using the Python API.

From the documentation:

tf.image.crop_to_bounding_box(image, offset_height, offset_width, target_height, target_width)

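Crops an image to a specified bounding box.

This op cuts a rectangular part out of image. The top-left corner of the returned image is at offset_height, offset_width in image, and its lower-right corner is at offset_height + target_height, offset_width + target_width.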

I can get the coordinates of a bounding box in normalized coordinates as

    ymin = boxes[0,i,0]
    xmin = boxes[0,i,1]
    ymax = boxes[0,i,2]
    xmax = boxes[0,i,3]

and convert them to absolute coordinates,

    (xminn, xmaxx, yminn, ymaxx) = (xmin * im_width, xmax * im_width, ymin * im_height, ymax * im_height)

But I can't figure out how to use these coordinates in the crop_to_bounding_box function.

Answer 1

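Since x is the horizontal axis and y is the vertical axis, the offsets are (yminn, xminn), the target height is ymaxx - yminn, and the target width is xmaxx - xminn. The following crops the image to the specified box.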

    cropped_image = tf.image.crop_to_bounding_box(image, yminn, xminn,
                                                  ymaxx - yminn, xmaxx - xminn)
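
One detail worth checking: multiplying normalized coordinates by the image size yields floats, while crop_to_bounding_box, as far as I know, expects integer pixel offsets and sizes. Below is a minimal sketch of the conversion in plain Python. The helper name box_to_crop_args is hypothetical (it is not part of TensorFlow), and rounding to the nearest pixel and clamping to the image bounds are my assumptions, not something the question specifies.

```python
def box_to_crop_args(ymin, xmin, ymax, xmax, im_height, im_width):
    """Convert a normalized [ymin, xmin, ymax, xmax] box into integer
    (offset_height, offset_width, target_height, target_width).

    Hypothetical helper for illustration; not a TensorFlow function.
    """
    # Scale normalized coordinates to pixels and round to the nearest pixel.
    top = int(round(ymin * im_height))
    left = int(round(xmin * im_width))
    bottom = int(round(ymax * im_height))
    right = int(round(xmax * im_width))

    # Clamp so the box stays inside the image and is at least 1 pixel
    # in each dimension (assumption: out-of-range boxes are clipped).
    top = min(max(top, 0), im_height - 1)
    left = min(max(left, 0), im_width - 1)
    bottom = min(max(bottom, top + 1), im_height)
    right = min(max(right, left + 1), im_width)

    return top, left, bottom - top, right - left
```

The returned tuple is in the same order as the function's arguments, so it can be unpacked directly, for example tf.image.crop_to_bounding_box(image, *box_to_crop_args(ymin, xmin, ymax, xmax, im_height, im_width)).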