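In short, when I run my code on the iPhone 6 simulator I get near-perfect text recognition with TesseractOCR, but when I run the same code on an actual iPhone 6, recognition quality drops sharply. On the device, some of the degradation occurs near non-text artifacts (for example, lines). In another case, there is a consistent single-character recognition error for the same alphanumeric string. Some of the text is bold and some is not, and there are several font sizes.
So my question is: does anyone know why recognition quality is better on the simulator? And what can I do to improve recognition on the iPhone?
I have built a simple iOS test app that uses TesseractOCRiOS to convert an image to text. Eventually the images will come from the device camera, but for now I am just using a test image of an invoice saved as a JPEG. The process looks like this: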
jpeg->UIImage->NSData->UIImage->Tesseract
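I assume this process is lossless. The intermediate conversion to NSData is part of the overall process because the device camera produces image data and the photo library stores image data; using a JPEG test image at this point just makes the process look a bit convoluted.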
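One way to check the lossless assumption would be to compare an image before and after the NSData round trip, on both the simulator and the device. The sketch below is not part of the app; `ImageSummary`, `summarize`, and `roundTrip` are hypothetical names, and it assumes the NSData step uses either JPEG or PNG encoding. If JPEG encoding is used, the pixels are recompressed, so that step would not be strictly lossless. It is written in current Swift syntax, whereas the recognition code in this post uses older Swift spellings.

```swift
import UIKit

/// Hypothetical helper (not part of the original app): the properties of a
/// UIImage that can change when it is serialized to data and reloaded.
struct ImageSummary: Equatable {
    let pixelWidth: Int
    let pixelHeight: Int
    let scale: CGFloat
    let orientation: UIImage.Orientation
}

func summarize(_ image: UIImage) -> ImageSummary {
    // `size` is in points; multiply by `scale` to get pixel dimensions.
    return ImageSummary(
        pixelWidth: Int((image.size.width * image.scale).rounded()),
        pixelHeight: Int((image.size.height * image.scale).rounded()),
        scale: image.scale,
        orientation: image.imageOrientation
    )
}

/// Round-trips an image through Data, mirroring UIImage -> NSData -> UIImage.
/// Assumption: the encoding is JPEG by default; pass `usePNG: true` to compare.
func roundTrip(_ image: UIImage, usePNG: Bool = false, quality: CGFloat = 0.9) -> UIImage? {
    let data = usePNG ? image.pngData() : image.jpegData(compressionQuality: quality)
    guard let encoded = data else { return nil }
    return UIImage(data: encoded)
}
```

Printing `summarize(original)` and `summarize(roundTrip(original)!)` on both the simulator and the device would show whether the image handed to Tesseract differs in pixel size, scale, or orientation between the two environments.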
To install TesseractOCRiOS, I followed the instructions on the wiki. I installed TesseractOCRiOS 4.0.0 with the following Podfile:
platform :ios, '9.0'
use_frameworks!
workspace 'DocumentCaptureV2'
xcodeproj 'DocumentCaptureDemo/DocumentCaptureDemo.xcodeproj/'
target 'DocumentCaptureDemo' do
  pod 'TesseractOCRiOS', '4.0.0'
end
target 'DocumentCaptureDemoTests' do
  pod 'TesseractOCRiOS', '4.0.0'
end
target 'DocumentCaptureDemoUITests' do
  pod 'TesseractOCRiOS', '4.0.0'
end
post_install do |installer|
  installer.pods_project.targets.each do |target|
    target.build_configurations.each do |config|
      config.build_settings['ENABLE_BITCODE'] = 'NO'
    end
  end
end
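I added the post_install section to get it to run on the iPhone 6 device. Without it (it took some searching to find this bit of magic), linking fails when trying to run on the device (weak libraries and bitcode cannot be used together).
The code I use to recognize text is below (it is basically the sample code from GitHub). I did change the maximum recognition time from 1 to 10, thinking the iPhone might need more time. It made no difference.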
func updatePageText(page: Page, image: UIImage?) {
    if image != nil {
        // Create a new `G8RecognitionOperation` to perform the OCR asynchronously
        // It is assumed that there is a .traineddata file for the language pack
        // you want Tesseract to use in the "tessdata" folder in the root of the
        // project AND that the "tessdata" folder is a referenced folder and NOT
        // a symbolic group in your project
        let operation = G8RecognitionOperation(language: "eng")
        // Use the original Tesseract engine mode in performing the recognition
        // (see G8Constants.h) for other engine mode options
        operation.tesseract.engineMode = G8OCREngineMode.TesseractOnly;
        // Let Tesseract automatically segment the page into blocks of text
        // based on its analysis (see G8Constants.h) for other page segmentation
        // mode options
        operation.tesseract.pageSegmentationMode = G8PageSegmentationMode.AutoOnly;
        // Optionally limit the time Tesseract should spend performing the
        // recognition
        //operation.tesseract.maximumRecognitionTime = 10.0;
        // Set the delegate for the recognition to be this class
        // (see `progressImageRecognitionForTesseract` and
        // `shouldCancelImageRecognitionForTesseract` methods below)
        operation.delegate = self;
        // Optionally limit Tesseract's recognition to the following whitelist
        // and blacklist of characters
        //operation.tesseract.charWhitelist = @"01234";
        //operation.tesseract.charBlacklist = @"56789";
        // Set the image on which Tesseract should perform recognition
        operation.tesseract.image = image;
        // Optionally limit the region in the image on which Tesseract should
        // perform recognition to a rectangle
        //operation.tesseract.rect = CGRectMake(20, 20, 100, 100);
        // Specify the function block that should be executed when Tesseract
        // finishes performing recognition on the image
        operation.recognitionCompleteBlock = {(tesseract: G8Tesseract!)->Void in
            page.text = tesseract.recognizedText.stringByTrimmingCharactersInSet(NSCharacterSet.whitespaceAndNewlineCharacterSet())
            Page.save()
        }
        // Finally, add the recognition operation to the queue
        operationQueue.addOperation(operation);
    }
}
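I am assuming that the UIImage I pass to updatePageText is complete (that is, not still loading). I have looked at the thresholded image on the iPhone 6 and I do not see anything odd. There is almost no documentation on Tesseract configuration, so I have not tried anything there.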