Question

I have a large number of files to parse.

They look like this (see the example):

Well, I thought it might be interesting to use Image::OCR::Tesseract (http://search.cpan.org/~leocharre/Image-OCR-Tesseract-1.24/lib/Image/OCR/Tesseract.pod). I think I could parse the files with it like this:

use Image::OCR::Tesseract 'get_ocr';

my $image = './hi.jpg';

my $text = get_ocr($image);

Is this the correct syntax?

Answer 1

You can download and compile the latest version of tesseract. Then you can write a (shell or Perl) script that feeds every file to it for parsing.
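
A minimal shell sketch of such a script could look like the following. It makes several assumptions that are not in the question: the images are .jpg files in a single directory, the installed tesseract build can read JPEG directly, and the recognized text should land in a .txt file next to each image. The function name ocr_dir is made up for illustration.

```shell
#!/bin/sh
# Sketch: run the tesseract command-line tool over every .jpg in a directory.
# Assumptions: .jpg input, one flat directory, output written as <name>.txt
# beside each image. ocr_dir is a hypothetical helper name.

ocr_dir() {
    dir=${1:-.}
    for img in "$dir"/*.jpg; do
        # If nothing matched, the pattern stays unexpanded; skip it.
        [ -e "$img" ] || continue
        base=${img%.jpg}
        # "tesseract IMAGE OUTPUTBASE" writes the text to OUTPUTBASE.txt.
        if tesseract "$img" "$base" >/dev/null 2>&1; then
            echo "ok: $img"
        else
            echo "failed: $img" >&2
        fi
    done
}

# Example call: ocr_dir ./scans
```

Failures are reported on stderr and do not stop the loop, so a single unreadable image should not abort a long batch. The same loop could be written in Perl around get_ocr if you would rather stay with Image::OCR::Tesseract.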