Question

I am using the SynthText dataset, and the word-level bounding box annotations are given in the form of 4 points. Here is what the doc says:

               - the first dimension is 2 for x and y respectively,
               - the second dimension corresponds to the 4 points
                 (clockwise, starting from top-left), and

So they have given the values ymin, ymax, xmin, xmax.

These are 4 values, which correspond to the top-left and bottom-right corners.

But the network I am trying to train takes 8 points as input:

x1,y1,x2,y2,x3,y3,x4,y4

Is there a way to go from my 4 points to 8 points?

Thanks in advance.

Answer 1

Use this approach (assuming the coordinates are taken in clockwise order):

x1 = top_left['x']
y1 = top_left['y']
x2 = bottom_right['x']
y2 = top_left['y']
x3 = bottom_right['x']
y3 = bottom_right['y']
x4 = top_left['x']
y4 = bottom_right['y']

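The idea is simple: the second point takes its x-coordinate from the bottom-right corner, and the fourth point takes its y-coordinate from the bottom-right corner. The third point is identical to the bottom-right corner.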
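
As a minimal sketch, the same mapping can be wrapped in a small function. The names box_to_corners and flatten_corners are hypothetical, chosen only for illustration. The first helper assumes an axis-aligned box given as two dicts with 'x' and 'y' keys, as in the snippet above. The second assumes that a box is already stored as the quoted documentation describes (one row of 4 x values and one row of 4 y values, clockwise from top-left), in which case the 8 values only need to be interleaved.

```python
def box_to_corners(top_left, bottom_right):
    """Expand an axis-aligned box into 4 corners, clockwise from top-left.

    Assumes both arguments are dicts with 'x' and 'y' keys.
    Returns [x1, y1, x2, y2, x3, y3, x4, y4].
    """
    x_min, y_min = top_left["x"], top_left["y"]
    x_max, y_max = bottom_right["x"], bottom_right["y"]
    return [
        x_min, y_min,  # point 1: top-left
        x_max, y_min,  # point 2: top-right
        x_max, y_max,  # point 3: bottom-right
        x_min, y_max,  # point 4: bottom-left
    ]


def flatten_corners(box):
    """Interleave a 2 x 4 box (row 0 = x values, row 1 = y values).

    Assumes the 4 points are already in the desired order.
    Returns [x1, y1, x2, y2, x3, y3, x4, y4].
    """
    xs, ys = box
    return [value for point in zip(xs, ys) for value in point]


corners = box_to_corners({"x": 10, "y": 20}, {"x": 50, "y": 40})
print(corners)
```

If the annotation really does hold 4 corner points per word, flatten_corners keeps any rotation of the box, whereas box_to_corners always yields an upright rectangle.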

Bounding box annotation, going from 4 points to 8
