Question

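I have implemented a neural network that recognizes handwritten digits, based on the MNIST dataset. I am using bare Python / NumPy, and now I would like to test my own handwritten images on the network. However, I would like to automate the cropping and scaling process so that I can supply an image taken with a smartphone and get a NumPy array in MNIST format.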

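So far I have had some success, but I don't really know how to go on from here. Here are two example images, each with its masked version below it; they are half the original image size to reduce the search space: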

image of a 4 image of a 7

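As you can see, something is happening, but the result is not satisfactory. Even if I had perfectly segmented and masked the "4" and the "7", I would not know how to proceed. How do you get the exact position of the digit so that I can crop it and scale it down to 28x28 pixels?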

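The code that produced these images is below. It basically computes a spatial histogram along the x and y pixel axes and then blacks out everything that does not contain enough black. plot() and hist() are just convenience functions, but they do produce the images you see, so I included them.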

body: SingleChildScrollView(
  child: Column(
    mainAxisSize: MainAxisSize.min,
    children: <Widget>[
      Text(
        'Headline',
        style: TextStyle(fontSize: 18),
      ),
      SizedBox(
        height: 200.0,
        child: ListView.builder(
          physics: ClampingScrollPhysics(),
          shrinkWrap: true,
          scrollDirection: Axis.horizontal,
          itemCount: 15,
          itemBuilder: (BuildContext context, int index) => Card(
                child: Center(child: Text('Dummy Card Text')),
              ),
        ),
      ),
      Text(
        'Demo Headline 2',
        style: TextStyle(fontSize: 18),
      ),
      Card(
        child: ListTile(title: Text('Motivation $int'), subtitle: Text('this is a description of the motivation')),
      ),
      Card(
        child: ListTile(title: Text('Motivation $int'), subtitle: Text('this is a description of the motivation')),
      ),
      Card(
        child: ListTile(title: Text('Motivation $int'), subtitle: Text('this is a description of the motivation')),
      ),
      Card(
        child: ListTile(title: Text('Motivation $int'), subtitle: Text('this is a description of the motivation')),
      ),
      Card(
        child: ListTile(title: Text('Motivation $int'), subtitle: Text('this is a description of the motivation')),
      ),
    ],
  ),
),
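
The histogram masking described above can be sketched in NumPy roughly as follows. This is only a minimal illustration of the idea, not the code that produced the images: the function name `projection_mask`, the parameters `dark_threshold` and `min_fraction`, and the assumption that the input is a grayscale float array in [0, 1] with dark ink on light paper are all mine.

```python
import numpy as np

def projection_mask(gray, dark_threshold=0.5, min_fraction=0.05):
    """Black out rows and columns that contain too few dark pixels.

    gray: 2-D float array in [0, 1], where 0 is black ink and 1 is white paper.
    dark_threshold and min_fraction are illustrative defaults, not tuned values.
    """
    dark = gray < dark_threshold          # True where a pixel counts as ink
    col_hist = dark.sum(axis=0)           # dark pixels per column (x axis)
    row_hist = dark.sum(axis=1)           # dark pixels per row (y axis)

    # Keep a column or row only if it holds enough dark pixels
    keep_cols = col_hist >= min_fraction * gray.shape[0]
    keep_rows = row_hist >= min_fraction * gray.shape[1]

    # A pixel survives only if both its row and its column are kept
    keep = keep_rows[:, None] & keep_cols[None, :]
    return np.where(keep, gray, 0.0), keep   # everything else is blacked out
```

The second return value, `keep`, is the boolean mask itself, which is probably more useful for locating the digit than the blacked-out image.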

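So I would like to end up with a square image in which the handwritten digit appears roughly in the middle. Getting the median of the digit's position might be enough for this, but I can't think of a simple, semi-reliable way to do it (it does not have to be production-grade).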
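
To make the median idea concrete, here is a rough sketch of what I have in mind, assuming the same kind of input (a grayscale float array in [0, 1], dark ink on light paper). The name `crop_to_mnist` and the parameters `dark_threshold` and `margin` are hypothetical, and the nearest-neighbour downsampling is a crude stand-in for a proper resize.

```python
import numpy as np

def crop_to_mnist(gray, dark_threshold=0.5, margin=1.4, size=28):
    """Crop a square around the ink and shrink it to size x size.

    gray: 2-D float array in [0, 1], 0 = black ink, 1 = white paper.
    dark_threshold and margin are hypothetical knobs chosen for illustration.
    """
    ys, xs = np.nonzero(gray < dark_threshold)   # coordinates of ink pixels
    if ys.size == 0:
        raise ValueError("no dark pixels found")

    # The median is less sensitive to stray dark specks than the mean
    cy, cx = int(np.median(ys)), int(np.median(xs))

    # Square side: the larger ink extent (5th to 95th percentile) plus a margin
    y_lo, y_hi = np.percentile(ys, [5, 95])
    x_lo, x_hi = np.percentile(xs, [5, 95])
    extent = max(y_hi - y_lo, x_hi - x_lo)
    half = max(int(extent * margin / 2), size // 2)

    # Pad with white so the square never runs off the image; after padding,
    # the original pixel (cy, cx) sits at (cy + half, cx + half)
    padded = np.pad(gray, half, mode="constant", constant_values=1.0)
    crop = padded[cy:cy + 2 * half, cx:cx + 2 * half]

    # Nearest-neighbour downsampling to size x size
    idx = np.arange(size) * (2 * half) // size
    small = crop[np.ix_(idx, idx)]

    return 1.0 - small   # MNIST stores ink as high values on a dark background
```

The result would still need to be flattened or reshaped to whatever input layout the network expects, and a single dark region elsewhere in the photo (a shadow, a table edge) would pull the median off the digit.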

How do I isolate a handwritten-digit candidate from an image?

0 answers: