Why is Tesseract OCR recognition quality worse on an iOS device than in the simulator?

Posted: 2016-02-08 19:03:56

Tags: ios swift tesseract

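In short, when I run my code in the iPhone 6 simulator I get nearly perfect text recognition from TesseractOCR, but when I run the same code on an actual iPhone 6, recognition quality drops significantly. On the device, some of the degradation occurs near non-text artifacts (for example, lines). In another case, the same alphanumeric string is consistently misrecognized by a single character. Some of the text is bold and some is not. There are several font sizes.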

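So my questions are: does anyone know why recognition quality is better in the simulator? And if anything can be done to improve recognition on the iPhone, what should I do?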

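I have built a simple iOS test app that uses TesseractOCRiOS to convert an image to text. Eventually the images will come from the device camera, but for now I am just using a test image of an invoice stored as a JPEG. The pipeline looks like this: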

jpeg->UIImage->NSData->UIImage->Tesseract

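I am assuming this pipeline is lossless. The intermediate conversion to NSData is part of the overall flow because the device camera produces image data and the photo library stores image data; using a JPEG test image at this stage makes the pipeline look more convoluted than it will eventually be.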
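One way to test the losslessness assumption is to compare the decoded pixel bytes before and after the NSData step. The following is a minimal sketch in current Swift syntax (the code in this question predates Swift 3). The names `maxByteDifference` and `decodedBytes` are hypothetical helpers, not part of TesseractOCRiOS, and the sketch assumes both images decode to the same pixel layout.

```swift
import Foundation
#if canImport(UIKit)
import UIKit
#endif

/// Largest absolute per-byte difference between two pixel buffers,
/// or nil when the buffers differ in length (for example, because the
/// image dimensions or pixel format changed during the round trip).
func maxByteDifference(_ before: [UInt8], _ after: [UInt8]) -> Int? {
    guard before.count == after.count else { return nil }
    var maxDiff = 0
    for (x, y) in zip(before, after) {
        maxDiff = max(maxDiff, abs(Int(x) - Int(y)))
    }
    return maxDiff
}

#if canImport(UIKit)
/// Hypothetical helper: the decoded bytes behind a UIImage, read from
/// its CGImage data provider. Returns nil if the image has no CGImage.
func decodedBytes(of image: UIImage) -> [UInt8]? {
    guard let data = image.cgImage?.dataProvider?.data else { return nil }
    return [UInt8](data as Data)
}
#endif
```

A result of 0 would mean the two decodes are byte-identical; a non-zero value or nil would mean the round trip changed the pixels or their layout. If the NSData step re-encodes the image as JPEG, some difference is expected, since JPEG encoding is lossy. Running the same check in the simulator and on the device would show whether the two environments hand Tesseract the same input.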

To install TesseractOCRiOS I followed the instructions on the wiki. I installed TesseractOCRiOS 4.0.0 using the following Podfile.

platform :ios, '9.0'
use_frameworks!
workspace 'DocumentCaptureV2'
xcodeproj 'DocumentCaptureDemo/DocumentCaptureDemo.xcodeproj/'

target 'DocumentCaptureDemo' do
   pod 'TesseractOCRiOS', '4.0.0'
end

target 'DocumentCaptureDemoTests' do
   pod 'TesseractOCRiOS', '4.0.0'
end

target 'DocumentCaptureDemoUITests' do
   pod 'TesseractOCRiOS', '4.0.0'
end

post_install do |installer|
   installer.pods_project.targets.each do |target|
       target.build_configurations.each do |config|
           config.build_settings['ENABLE_BITCODE'] = 'NO'
       end
   end
end

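The post_install section was added to get the app running on the iPhone 6 device. Without that bit of Podfile magic (which took some searching to find), linking fails when trying to run on the device (a weak library cannot be used with bitcode enabled).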

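The code I use to recognize the text is below (it is basically the sample code from GitHub). I did change the maximum recognition time from 1 to 10, thinking the iPhone might need more time. It made no difference.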

func updatePageText(page: Page, image: UIImage?) {
    if image != nil {
        // Create a new `G8RecognitionOperation` to perform the OCR asynchronously
        // It is assumed that there is a .traineddata file for the language pack
        // you want Tesseract to use in the "tessdata" folder in the root of the
        // project AND that the "tessdata" folder is a referenced folder and NOT
        // a symbolic group in your project
        let operation = G8RecognitionOperation(language: "eng")

        // Use the original Tesseract engine mode in performing the recognition
        // (see G8Constants.h) for other engine mode options
        operation.tesseract.engineMode = G8OCREngineMode.TesseractOnly;

        // Let Tesseract automatically segment the page into blocks of text
        // based on its analysis (see G8Constants.h) for other page segmentation
        // mode options
        operation.tesseract.pageSegmentationMode = G8PageSegmentationMode.AutoOnly;

        // Optionally limit the time Tesseract should spend performing the
        // recognition
        //operation.tesseract.maximumRecognitionTime = 10.0;

        // Set the delegate for the recognition to be this class
        // (see `progressImageRecognitionForTesseract` and
        // `shouldCancelImageRecognitionForTesseract` methods below)
        operation.delegate = self;

        // Optionally limit Tesseract's recognition to the following whitelist
        // and blacklist of characters
        //operation.tesseract.charWhitelist = "01234";
        //operation.tesseract.charBlacklist = "56789";

        // Set the image on which Tesseract should perform recognition
        operation.tesseract.image = image;

        // Optionally limit the region in the image on which Tesseract should
        // perform recognition to a rectangle
        //operation.tesseract.rect = CGRectMake(20, 20, 100, 100);

        // Specify the function block that should be executed when Tesseract
        // finishes performing recognition on the image

        operation.recognitionCompleteBlock = {(tesseract: G8Tesseract!)->Void in
            page.text = tesseract.recognizedText.stringByTrimmingCharactersInSet(NSCharacterSet.whitespaceAndNewlineCharacterSet())
            Page.save()
        }

        // Finally, add the recognition operation to the queue
        operationQueue.addOperation(operation);
    }

}

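I am assuming that the UIImage I pass to updatePageText is complete (i.e., not still loading). I have looked at the thresholded image on the iPhone 6 and I can't see anything odd. There is almost no documentation on Tesseract configuration, so I haven't tried anything there.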
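Beyond eyeballing the thresholded image, it may help to log a few decode-level properties of the UIImage right before it is handed to Tesseract and compare the values from the simulator with those from the device. This is only a diagnostic sketch in current Swift syntax; `ImageSummary`, `differences(between:and:)` and `summarize` are hypothetical names, and nothing here establishes that any of these properties is the actual cause.

```swift
import Foundation
#if canImport(UIKit)
import UIKit
#endif

/// Decode-level properties of an image worth comparing across runs.
struct ImageSummary: Equatable {
    let pixelWidth: Int
    let pixelHeight: Int
    let scale: Double
    let orientation: Int
}

/// Lists which fields differ between two summaries, for example one
/// logged in the simulator and one logged on the device.
func differences(between a: ImageSummary, and b: ImageSummary) -> [String] {
    var result: [String] = []
    if a.pixelWidth != b.pixelWidth || a.pixelHeight != b.pixelHeight {
        result.append("pixel size: \(a.pixelWidth)x\(a.pixelHeight) vs \(b.pixelWidth)x\(b.pixelHeight)")
    }
    if a.scale != b.scale {
        result.append("scale: \(a.scale) vs \(b.scale)")
    }
    if a.orientation != b.orientation {
        result.append("orientation: \(a.orientation) vs \(b.orientation)")
    }
    return result
}

#if canImport(UIKit)
/// Hypothetical helper: builds a summary from a UIImage. Pixel
/// dimensions fall back to 0 when the image has no backing CGImage.
func summarize(_ image: UIImage) -> ImageSummary {
    return ImageSummary(
        pixelWidth: image.cgImage?.width ?? 0,
        pixelHeight: image.cgImage?.height ?? 0,
        scale: Double(image.scale),
        orientation: image.imageOrientation.rawValue)
}
#endif
```

An empty result for the same test image in both environments would suggest looking elsewhere, such as at the pixel data itself or the Tesseract settings.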

0 answers:

No answers yet