Question

Problem

I want to screen-scrape a popular smartphone game to read the Gold, Elixir, and Dark Elixir values from a game instance running in an Android VM image.

However, Tesseract recognizes some samples correctly but fails on others. Running the same samples through an online OCR service returns correct results.

I use the standard English training data, and I also trained Tesseract on the Supercell-Magic font, which improved accuracy by about 30%.

Samples

gold_sample_1

gold_sample_1_processed

magick gold_sample_1.png -fill Black +opaque "#fffbcc" -fill White -opaque "#fffbcc" gold_sample_1_processed.png
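For readers unfamiliar with the `-opaque` / `+opaque` pair, the command above amounts to an exact-color mask: every pixel that is not `#fffbcc` becomes black, and every pixel that is `#fffbcc` becomes white. Below is a minimal pure-Python sketch of that idea. The function names and the nested-list pixel layout are hypothetical, chosen only for illustration, and the sketch assumes exact color matching (no fuzz tolerance).

```python
def parse_hex_color(hex_color):
    """Convert a '#rrggbb' string into an (r, g, b) tuple of ints."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected a color like '#fffbcc', got {hex_color!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def mask_color(pixels, hex_color):
    """Return a black-and-white copy of `pixels`.

    `pixels` is assumed to be a list of rows, each row a list of
    (r, g, b) tuples. Pixels equal to `hex_color` become white,
    everything else becomes black.
    """
    target = parse_hex_color(hex_color)
    white, black = (255, 255, 255), (0, 0, 0)
    return [[white if px == target else black for px in row] for row in pixels]


# Tiny 1x3 "image": one text-colored pixel between two background pixels.
row = [(12, 34, 56), (255, 251, 204), (200, 200, 200)]
print(mask_color([row], "#fffbcc"))
```

Note that the two samples use different text colors (`#fffbcc` and `#ffffff`), so the mask color has to be chosen per sample.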

Output

40 494
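Since the goal is a numeric Gold value, the recognized text still needs to be turned into a number. The sketch below assumes that the space in `40 494` is a thousands separator and that a valid reading contains only digits and whitespace. The helper name `parse_resource_value` is hypothetical.

```python
def parse_resource_value(ocr_text):
    """Turn OCR output such as '40 494' into the integer 40494.

    Raises ValueError when the text is empty or contains anything
    other than ASCII digits and whitespace, so a failed recognition
    is not silently treated as a number.
    """
    digits = "".join(ocr_text.split())
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"unexpected OCR output: {ocr_text!r}")
    return int(digits)


print(parse_resource_value("40 494\n"))
```

An empty output file raises `ValueError` here, which makes a failed recognition easy to detect in a script.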

gold_sample_3

gold_sample_3_processed

magick gold_sample_3.png -fill Black +opaque "#ffffff" -fill White -opaque "#ffffff" gold_sample_3_processed.png

Output

There is nothing in the output file

However, uploading the same image to an online OCR service gives me this:

Specs

OS

Windows 7 x64 SP1

My Win7 still hasn't ninja-upgraded itself the way so many others around the world have ;)

Tesseract OCR

tesseract 3.05.00dev
leptonica-1.73 (Feb  5 2016, 01:13:58) [MSC v.1900 LIB Release x86]
libgif 5.1.2 : libjpeg 9 : libpng 1.6.19 : libtiff 4.0.2 : zlib 1.2.8 : libwebp 0.3.1.

ImageMagick

Version: ImageMagick 7.0.2-1 Q8 x86 2016-06-23 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2015 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Visual C++: 180040629
Features: Cipher DPC Modules OpenMP
Delegates (built-in): bzlib cairo flif freetype jng jp2 jpeg lcms lqr openexr pangocairo png ps rsvg tiff webp xml zlib

Answer 1

Solved!

Specify the page segmentation mode (psm) explicitly.

tesseract --help-psm
Page segmentation modes:
  0    Orientation and script detection (OSD) only.
  1    Automatic page segmentation with OSD.
  2    Automatic page segmentation, but no OSD, or OCR.
  3    Fully automatic page segmentation, but no OSD. (Default)
  4    Assume a single column of text of variable sizes.
  5    Assume a single uniform block of vertically aligned text.
  6    Assume a single uniform block of text.
  7    Treat the image as a single text line.
  8    Treat the image as a single word.
  9    Treat the image as a single word in a circle.
 10    Treat the image as a single character.

The image:

and the command:

tesseract gold_sample_3_processed.png sample3 -l eng2 -psm 8

give the output:

Thanks anyway, internet strangers.
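To apply the same fix from a script, the page segmentation mode can be passed explicitly on every call. The following is a minimal sketch, not a definitive wrapper: the function names are hypothetical, `eng2` is the custom language name used in the command above, and the `-psm` spelling matches the Tesseract 3.05 build listed in the question (Tesseract 4 and later spell the flag `--psm`, so it is left configurable).

```python
import subprocess


def build_tesseract_command(image_path, output_base, lang="eng2", psm=8,
                            psm_flag="-psm"):
    """Build the argument list for a Tesseract call with an explicit psm.

    psm=8 treats the image as a single word. Modes 0-10 are the ones
    listed by `tesseract --help-psm` for this version.
    """
    if not 0 <= psm <= 10:
        raise ValueError(f"psm must be between 0 and 10, got {psm}")
    return ["tesseract", image_path, output_base, "-l", lang, psm_flag, str(psm)]


def run_tesseract(image_path, output_base, **options):
    """Run Tesseract and return the text it wrote to <output_base>.txt.

    Assumes the `tesseract` executable is on PATH.
    """
    command = build_tesseract_command(image_path, output_base, **options)
    subprocess.run(command, check=True)
    with open(output_base + ".txt", encoding="utf-8") as handle:
        return handle.read().strip()


# Show the command without executing it.
print(" ".join(build_tesseract_command("gold_sample_3_processed.png", "sample3")))
```

Keeping the command construction separate from the execution makes it easy to check the exact arguments before running them against real screenshots.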

Tesseract doesn't return consistent results
