I am trying to train Tesseract 3.02 on Ubuntu 14.04, following the guide in Cedric's blog.
First, I tried to generate a box file with the following command:
tesseract eng.mr.exp0.jpg eng.mr.exp0 batch.nochop makebox
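However, this command produced a box file with a single line that treats the whole image as one box (it should have produced a box file with 6 lines). So I used jTessBoxEditor to edit the box file and create 6 boxes with the correct coordinates and characters. Now, when I try to train Tesseract with this box file using the command
tesseract eng.mr.exp0.jpg eng.mr.exp0.box nobatch box.train
I get the following error: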
Tesseract Open Source OCR Engine v3.03 with Leptonica
FAIL!
APPLY_BOXES: boxfile line 1/0 ((20,24),(95,192)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 2/7 ((96,24),(171,192)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 3/0 ((172,24),(248,192)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 4/3 ((248,24),(324,192)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 5/3 ((324,24),(400,192)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 6/0 ((400,24),(476,192)): FAILURE! Couldn't find a matching blob
APPLY_BOXES:
Boxes read from boxfile: 6
Boxes failed resegmentation: 6
APPLY_BOXES: Unlabelled word at :Bounding box=(0,19)->(480,192)
Found 0 good blobs.
1 remaining unlabelled words deleted.
Generated training data for 0 words
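To make the failed boxes easier to compare against the image, here is a minimal Python sketch that extracts them from a log like the one above and prints them back as box-file lines. The names `parse_failures` and `to_box_line` are hypothetical helpers, not part of Tesseract. The sketch assumes that the two coordinate pairs in each log line are (left, bottom) and (right, top), and that the box file uses the usual `char left bottom right top page` layout with page 0.

```python
import re

# Matches log lines such as:
# APPLY_BOXES: boxfile line 1/0 ((20,24),(95,192)): FAILURE! Couldn't find a matching blob
FAILURE_RE = re.compile(
    r"boxfile line (\d+)/(\S+) \(\((\d+),(\d+)\),\((\d+),(\d+)\)\): FAILURE! (.*)"
)


def parse_failures(log_text):
    """Return one dict per failed box reported in the log text."""
    failures = []
    for line in log_text.splitlines():
        match = FAILURE_RE.search(line)
        if match is None:
            continue  # skip "FAIL!" markers and summary lines
        line_no, char, x0, y0, x1, y1, reason = match.groups()
        failures.append({
            "line": int(line_no),
            "char": char,
            # Assumption: first pair is (left, bottom), second is (right, top).
            "left": int(x0),
            "bottom": int(y0),
            "right": int(x1),
            "top": int(y1),
            "reason": reason.strip(),
        })
    return failures


def to_box_line(failure, page=0):
    """Format a parsed failure as a box-file line (assumed layout)."""
    return "{char} {left} {bottom} {right} {top} {page}".format(page=page, **failure)


# Two lines copied from the log above, used as sample input.
SAMPLE_LOG = """FAIL!
APPLY_BOXES: boxfile line 1/0 ((20,24),(95,192)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 2/7 ((96,24),(171,192)): FAILURE! Couldn't find a matching blob
"""

for failure in parse_failures(SAMPLE_LOG):
    print(to_box_line(failure))
```

Run against the full log, this lists all 6 boxes, so each one can be checked against the character positions in the image.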
What am I doing wrong?
The image I used is here