Question

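I am using the Tesseract library to read information from MRZ (machine-readable zone) images. I tried a few images from Google Images and got good results. But with live images, that is, images captured with the iPhone camera, I do not get good results.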

An image found through Google gave good results. Its size is 543x83.

When I capture an image with the iPhone, OCR performs poorly. The captured image is 2205x268.

1. How can I get good results for live images like the iPhone capture?

2. Is there a recommended image size for Tesseract OCR?

Answer 1

I have had some success using ImageMagick. It is free and available for OSX, Windows and Linux. Generic parameters are hard to find, so expect quite a bit of fiddling:

#!/bin/bash

# Enhance image as much as possible for Tesseract OCR
convert input.jpg -normalize  \( -clone 0 -colorspace gray -negate -lat 50x50+10% -contrast-stretch 0 -blur 1x65535 -level 50x100% \) -compose copy_opacity -composite -opaque none -background white -adaptive-blur 3 out.jpg

# OCR the image and cat the results
tesseract out.jpg p && cat p.txt

OCR text output

IDFRADOUEL<<<<<<<<<<<<<<<<<<<<932013
U506932020438CHRISTIANE<<NI2906209F3

The ImageMagick command above produces the cleaned-up image that is then passed to Tesseract.
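
Because good parameters take some fiddling, one way to explore them is to sweep the `-lat` window size and offset and compare the OCR text for each variant. The following is only a sketch under assumptions: the value lists (25/50/75 and 5/10/15), the `out_<window>_<offset>.jpg` file names, the `run_variant` helper and the `DRY_RUN` switch are all made up for illustration and are not part of the original answer. Everything else in the command is copied from the script above.

```shell
#!/bin/bash
# Sketch: try several -lat settings for the enhancement command shown above.
# Value lists, output names and DRY_RUN are illustrative assumptions.

input="${1:-input.jpg}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# Build (and optionally run) the convert command for one window size and offset.
run_variant() {
    local win="$1" off="$2"
    local out="out_${win}_${off}.jpg"
    local cmd=(convert "$input" -normalize
        '(' -clone 0 -colorspace gray -negate -lat "${win}x${win}+${off}%"
            -contrast-stretch 0 -blur 1x65535 -level 50x100% ')'
        -compose copy_opacity -composite -opaque none
        -background white -adaptive-blur 3 "$out")

    if [ "$DRY_RUN" = 1 ]; then
        # Print a shell-quoted version of the command without executing it.
        printf '%q ' "${cmd[@]}"
        echo
    else
        # Enhance the image, OCR it, and show the recognized text.
        "${cmd[@]}" && tesseract "$out" "${out%.jpg}" && cat "${out%.jpg}.txt"
    fi
}

for win in 25 50 75; do
    for off in 5 10 15; do
        echo "== window ${win}x${win}, offset ${off}% =="
        run_variant "$win" "$off"
    done
done
```

Run it with `DRY_RUN=0 ./sweep.sh photo.jpg` (the script name is hypothetical) to produce the images and OCR text; the default dry run only prints the commands so they can be inspected first.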
