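I hope everyone is doing well. Could you help me with the following problem? I don't believe PyTorch, OpenCV, etc. provide a function for this, so I'm trying to implement it by hand.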
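Context: I collect a batch of frames, push it to the GPU, and keep it there. I then run several networks (YOLO) on it. For that to work, I need to resize the images on the GPU to the dimensions YOLO expects while preserving the aspect ratio (the rest of the image is padded with gray). I did this as described here: https://discuss.pytorch.org/t/resize-images-torch-tensor-already-loaded-on-gpu-keeping-their-aspect-ratio/83172/9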
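To make the resize step concrete, here is a minimal sketch of the geometry I'm assuming. It is plain Python with no GPU code, and the helper name `letterbox_geometry` is only for illustration (it is not taken from the linked thread). It also assumes the padding is split evenly between the two sides, which is what the rescaling later in this post relies on.

```python
def letterbox_geometry(original_shape, target_dim):
    """Return (new_h, new_w, pad_top, pad_left) for a centered letterbox resize.

    original_shape is (height, width); target_dim is the side of the square
    network input. Assumption: padding is split evenly on both sides.
    """
    original_h, original_w = original_shape
    # Scale so that the longer side exactly fills the square input
    scale = target_dim / max(original_h, original_w)
    new_h = round(original_h * scale)
    new_w = round(original_w * scale)
    # Whatever is left over on the shorter side becomes gray padding
    pad_top = (target_dim - new_h) // 2
    pad_left = (target_dim - new_w) // 2
    return new_h, new_w, pad_top, pad_left


print(letterbox_geometry((720, 1280), 416))  # (234, 416, 91, 0)
```

So a 1280x720 frame fed to a 416x416 network would get 91 rows of padding above the image, and any box whose top edge is reported above row 91 lies inside the padding.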
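What I need to do now is rescale the bounding box coordinates so that they are relative to the original frame size rather than to the resized dimensions fed to the network for detection. This is what I'm doing right now: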
    @staticmethod
    def rescale_bounding_box(detections: dict, current_dim: int, original_shape: tuple) -> dict:
        """
        Rescale bounding boxes predicted for image(s) of the size current_dim to the original_shape
        :param detections: dict mapping the image's batch index to a list of detections,
            each being [left, top, right, bottom, obj_score, conf, class_index]
        :param current_dim: side length of the square image fed to the network
        :param original_shape: (height, width) of the original frame
        :return: dict with the same keys, holding the rescaled detections
        """
        original_h, original_w = original_shape
        # Added padding (check if the image was tall or short)
        pad_x = max(original_h - original_w, 0) * (current_dim / max(original_shape))
        pad_y = max(original_w - original_h, 0) * (current_dim / max(original_shape))
        # Image size after padding's been removed
        unpad_h = current_dim - pad_y
        unpad_w = current_dim - pad_x
        output = dict()
        for img_batch_index, predictions in detections.items():
            output[img_batch_index] = list()
            # On each image in the batch there's a number of detections that need to be rescaled
            for prediction in predictions:
                new_left = int((prediction[0] - pad_x // 2) * (original_w / unpad_w))
                new_top = int((prediction[1] - pad_y // 2) * (original_h / unpad_h))
                new_right = int((prediction[2] - pad_x // 2) * (original_w / unpad_w))
                new_bot = int((prediction[3] - pad_y // 2) * (original_h / unpad_h))
                obj_score = round(prediction[4], 4)
                conf = round(prediction[5], 4)
                index = int(prediction[6])
                print("Rescaled:", new_left, new_top, new_right, new_bot)
                # Save modified results
                assert all((new_left < new_right, new_top < new_bot)), "Coordinates rescaled wrong"
                assert all((new_left > 0, new_top > 0, new_right > 0, new_bot > 0)), "Coordinates rescaled wrong"
                assert all((new_right <= original_w, new_bot <= original_h)), "Coordinates rescaled wrong"
                output[img_batch_index].append(
                    [new_left, new_top, new_right, new_bot, obj_score, conf, index]
                )
        return output
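Unfortunately, it doesn't pass the assert statements. My new_top value is often < 0, and new_right can go past the right edge of the frame.
detections is a dict where each key is the index of an image in the batch (1 for a single image) and each value is a list of lists. Every nested list is one detection predicted by the network.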
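For comparison, here is a minimal sketch of the inverse mapping as I understand it. It rests on two assumptions I haven't confirmed for my pipeline: the boxes are (x1, y1, x2, y2) in pixels of the padded square input, and the padding was split evenly on both sides. The clamping at the end is an addition that is not in my code above; detectors can predict boxes that reach slightly into the padding or past the image edge, which on its own might be enough to trip the asserts. The name `rescale_box` and the sample detection values are made up for illustration.

```python
def rescale_box(box, current_dim, original_shape):
    """Map (x1, y1, x2, y2) from the padded square input back to the original frame."""
    original_h, original_w = original_shape
    # Same scale factor the letterbox resize used: longer side fills the square
    scale = current_dim / max(original_h, original_w)
    # Padding on each side, ASSUMING the resized image was centered
    pad_left = (current_dim - original_w * scale) / 2
    pad_top = (current_dim - original_h * scale) / 2

    def to_frame(value, pad, upper):
        # Remove the padding offset, undo the scaling, then clamp to the frame
        return min(max(int(round((value - pad) / scale)), 0), upper)

    x1, y1, x2, y2 = box
    return [
        to_frame(x1, pad_left, original_w),
        to_frame(y1, pad_top, original_h),
        to_frame(x2, pad_left, original_w),
        to_frame(y2, pad_top, original_h),
    ]


# Illustrative input only: one image with key 1 and one detection that
# extends into the top padding and past the right edge.
detections = {1: [[100.0, 40.0, 420.0, 360.0, 0.9731, 0.8812, 0.0]]}
rescaled = {
    idx: [rescale_box(p[:4], 400, (600, 800)) + p[4:] for p in preds]
    for idx, preds in detections.items()
}
print(rescaled[1][0][:4])  # [200, 0, 800, 600]
```

If the padding in the resize step is not actually centered (for example, added only at the bottom or right), the `pad_left` and `pad_top` offsets here would be wrong in the same way as `pad_x // 2` and `pad_y // 2` in my function.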
If anyone can, please give me a hand. I've spent a lot of time on this without success.
Thanks.