Question

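I'm writing a handwriting recognition system using an artificial neural network, but I've run into a problem: I want to separate the characters in a scanned image and get the AABB of each one (I don't want to draw it on the image, only compute it).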

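You can assume the characters are pure black and the background is pure white (I have already written the thresholding algorithm).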

std::vector < unsigned char > px; // pixel data (RGBARGBARGBARGBA...)
unsigned w, h; // width and height of image

lodepng::decode( px, w, h, infile ); // I use LodePNG to decode the image

for( std::size_t i = 0; i + 3 < px.size(); i += 4 ) // unsigned index avoids a signed/unsigned comparison
{
    unsigned char & r = px[ i ], & g = px[ i + 1 ], & b = px[ i + 2 ], & a = px[ i + 3 ];

    // and what now?
}

lodepng::encode( outfile, px, w, h );

Image of problem (sorry, but I don't have enough reputation to post images :( )

Answer 1

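The image-processing task shown in your picture is called "segmentation", and there are many ways to do it. The simplest is probably to pick the first black pixel (top-most, left-most) and check whether any of its 8 neighboring pixels is also black. Add those to the set of pixels that belong to the same letter and recurse. To prevent infinite recursion, check only the neighbors of points added in the previous recursion step, and never add a point twice. You can keep track of which points have been added by creating a blank canvas the size of the input and setting each pixel's value to the iteration in which you found it. So the first pixel gets the value 1, its neighbors get 2, the neighbors of neighbors get 3, and so on.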
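
The iteration-numbered canvas can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the thresholded image is taken to be already reduced to one byte per pixel (nonzero means black), the names `labelLetter`, `black`, and `canvas` are made up for this example, and a frontier queue stands in for the recursion so that pixels are reached in the same iteration order.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Labels every black pixel 8-connected to the seed (sx, sy) with the
// iteration in which it was reached: seed = 1, its neighbours = 2, ...
// A 0 in the returned canvas means "not part of this letter".
// Assumption: `black` holds one byte per pixel, nonzero = black.
std::vector<int> labelLetter(const std::vector<unsigned char>& black,
                             int w, int h, int sx, int sy)
{
    assert(w > 0 && h > 0);
    assert(black.size() == static_cast<std::size_t>(w) * h);
    assert(sx >= 0 && sx < w && sy >= 0 && sy < h);

    std::vector<int> canvas(black.size(), 0);
    const std::size_t seed = static_cast<std::size_t>(sy) * w + sx;
    if (!black[seed]) return canvas;          // seed is not black: nothing to label

    std::queue<std::pair<int, int>> frontier; // points added but not yet expanded
    canvas[seed] = 1;
    frontier.push({sx, sy});

    while (!frontier.empty()) {
        const std::pair<int, int> p = frontier.front();
        frontier.pop();
        const int x = p.first, y = p.second;
        const int next = canvas[static_cast<std::size_t>(y) * w + x] + 1;

        for (int dy = -1; dy <= 1; ++dy) {
            for (int dx = -1; dx <= 1; ++dx) {
                const int nx = x + dx, ny = y + dy;
                if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
                const std::size_t n = static_cast<std::size_t>(ny) * w + nx;
                // Skip white pixels and points that were already added.
                if (!black[n] || canvas[n] != 0) continue;
                canvas[n] = next;
                frontier.push({nx, ny});
            }
        }
    }
    return canvas;
}
```

Every nonzero entry of the returned canvas belongs to the letter containing the seed, so the same canvas doubles as the "already added" check.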

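At some point you reach the letter's bottom-left pixel and no new black neighbors turn up, so you can erase the letter's pixels from the input. This ensures that the next top-left black pixel belongs to the second letter.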

The Axis Aligned Bounding Box is now simply the min/max x/y of all the pixels of a letter.
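
As a small illustration of that last step, here is a hedged C++ sketch that reduces one letter's pixel list to its box. The `AABB` struct and the `boundingBox` function are hypothetical names chosen for this example, and the box is treated as inclusive on all four sides.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Inclusive axis-aligned bounding box in pixel coordinates.
struct AABB { int minX, minY, maxX, maxY; };

// Computes the box of one letter from the (x, y) positions of its pixels.
// Assumption: the list is non-empty (a letter has at least one pixel).
AABB boundingBox(const std::vector<std::pair<int, int>>& pixels)
{
    assert(!pixels.empty());
    AABB box{pixels[0].first, pixels[0].second,
             pixels[0].first, pixels[0].second};
    for (const auto& p : pixels) {
        box.minX = std::min(box.minX, p.first);
        box.minY = std::min(box.minY, p.second);
        box.maxX = std::max(box.maxX, p.first);
        box.maxY = std::max(box.maxY, p.second);
    }
    return box;
}
```

The width and height of the letter in pixels would then be `maxX - minX + 1` and `maxY - minY + 1`.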

Answer 2

I used a different algorithm (thanks to MSalters for the idea). Maybe it will help someone, so here is the pseudocode for it. (I have tested it.)

Copy image to image2
for each(Pixel p in image2)
{
  if(p is black)
  {
    Add p to container
    Set p color to white
    Call findNeighbours(p position)

    top-left of aabb = (lowest x of pixels in container, lowest y of pixels in container)
    bottom-right of aabb = (highest x of pixels in container, highest y of pixels in container)
    Save this aabb
    Clear container
  }
}
All objects found, all pixels should be white

function findNeighbours(x, y)
{
  for each(neighbour of pixel (x, y))
  {
    if(this neighbour is black)
    {
      Set this neighbour's color to white
      Add this neighbour's position to container
      Call findNeighbours(this neighbour's position)
    }
  }
}
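
For readers who want something compilable, the pseudocode above might translate to C++ along these lines. This is a sketch, not the tested code from this answer, and it makes several assumptions: the buffer is RGBA as in the question, a pixel counts as black when its red channel is 0, "neighbour" means the 8 surrounding pixels, and the names `Box` and `findBoxes` are invented here. It also departs from the pseudocode in two ways: an explicit stack replaces the recursive `findNeighbours` call (deep recursion on a large blob can overflow the call stack), and the min/max values are updated as pixels are visited instead of being collected in a container first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Box { unsigned minX, minY, maxX, maxY; };  // inclusive, in pixels

// `px` is taken by value, so it plays the role of "image2" in the pseudocode:
// the caller's image is left untouched while the copy is whitened.
std::vector<Box> findBoxes(std::vector<unsigned char> px, unsigned w, unsigned h)
{
    assert(px.size() == static_cast<std::size_t>(w) * h * 4);

    // Byte offset of pixel (x, y) in the RGBA buffer.
    auto at = [w](unsigned x, unsigned y) {
        return 4 * (static_cast<std::size_t>(y) * w + x);
    };
    // Assumption: after thresholding, red == 0 identifies a black pixel.
    auto whiten = [&px](std::size_t i) { px[i] = px[i + 1] = px[i + 2] = 255; };

    std::vector<Box> boxes;
    std::vector<std::pair<unsigned, unsigned>> stack;  // pixels still to expand

    for (unsigned y = 0; y < h; ++y) {
        for (unsigned x = 0; x < w; ++x) {
            if (px[at(x, y)] != 0) continue;           // not black
            Box box{x, y, x, y};
            whiten(at(x, y));
            stack.push_back({x, y});

            while (!stack.empty()) {
                const unsigned cx = stack.back().first;
                const unsigned cy = stack.back().second;
                stack.pop_back();
                box.minX = std::min(box.minX, cx);
                box.minY = std::min(box.minY, cy);
                box.maxX = std::max(box.maxX, cx);
                box.maxY = std::max(box.maxY, cy);

                for (int dy = -1; dy <= 1; ++dy) {
                    for (int dx = -1; dx <= 1; ++dx) {
                        const long nx = static_cast<long>(cx) + dx;
                        const long ny = static_cast<long>(cy) + dy;
                        if (nx < 0 || ny < 0 ||
                            nx >= static_cast<long>(w) || ny >= static_cast<long>(h))
                            continue;
                        const std::size_t n = at(static_cast<unsigned>(nx),
                                                 static_cast<unsigned>(ny));
                        if (px[n] != 0) continue;      // white or already visited
                        whiten(n);                     // mark before pushing
                        stack.push_back({static_cast<unsigned>(nx),
                                         static_cast<unsigned>(ny)});
                    }
                }
            }
            boxes.push_back(box);                      // one AABB per object
        }
    }
    return boxes;
}
```

With the question's variables this would be called as `findBoxes(px, w, h)` right after `lodepng::decode`. Boxes come back in scan order, that is, ordered by the first pixel of each object encountered row by row.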

Calculating the AABB of many objects in an image
