Question

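I have been using Tesseract for a handwriting recognition project on Linux Mint Cinnamon. By default, Tesseract does not read handwriting well, so I decided to train my own neural network and use it in Tesseract, but Tesseract cannot read my trained data.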

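Based on what I learned from this question: How to use trained data with pytesseract? I tried moving my trained data to /home/usr/tesseract-ocr/4.00/tessdata/, which is where all the other .traineddata files are. My trained data file is a Binary (application/octet-stream) file, just like the .traineddata files that ship with Tesseract. I also passed the language in my code, pytesseract.image_to_string(imagePath, lang = 'MyTrainedData'), but it still gives me this error: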

File "/home/usr/pytesseract/pytesseract.py", line 194, in run_tesseract
raise TesseractError(status_code, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, 'Error opening data file /usr/share/tesseract-ocr/4.00/tessdata/MyTrainedData.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'MyTrainedData\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')
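
The error names the directory Tesseract actually searched, /usr/share/tesseract-ocr/4.00/tessdata/, which is not the /home/usr/... directory the file was moved to. One thing that might be worth trying is telling Tesseract explicitly where the model lives. Below is a minimal sketch, assuming the installed pytesseract forwards its `config` string to the tesseract command line and that the binary accepts `--tessdata-dir`. The helper names `tessdata_config` and `ocr_with_model` are hypothetical, not part of pytesseract.

```python
import os


def tessdata_config(tessdata_dir):
    """Build the config string that tells Tesseract where to look for models."""
    return '--tessdata-dir "{}"'.format(tessdata_dir)


def ocr_with_model(image_path, tessdata_dir, lang):
    """Run OCR with a model stored in an explicit tessdata directory.

    Hypothetical helper: it only wraps pytesseract.image_to_string.
    """
    model_path = os.path.join(tessdata_dir, lang + ".traineddata")
    # Fail early with a clear message if the file is missing or unreadable.
    if not os.path.isfile(model_path):
        raise FileNotFoundError(model_path)
    if not os.access(model_path, os.R_OK):
        raise PermissionError(model_path)

    # Imported here so the path checks above work even without pytesseract.
    import pytesseract

    return pytesseract.image_to_string(
        image_path, lang=lang, config=tessdata_config(tessdata_dir)
    )
```

For example, `ocr_with_model(imagePath, "/home/usr/tesseract-ocr/4.00/tessdata", "MyTrainedData")` would either raise before Tesseract is started (wrong path or missing read permission) or run Tesseract against that directory instead of the default one.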

How can I use my trained data in Tesseract?

Thanks.

Edit:

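I also tried putting my trained data in /usr/share/tesseract-ocr/4.00/tessdata, but that still doesn't work. If I remove the eng.traineddata file from that folder, the eng language stops working, but my trained data still isn't read correctly.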
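
Since removing eng.traineddata from that folder disables eng, Tesseract does seem to read /usr/share/tesseract-ocr/4.00/tessdata. One way to check whether it also recognizes the custom file there is to ask the binary which models it finds. This is a rough sketch, assuming the `tesseract` executable is on PATH and that `--list-langs` prints one header line followed by one language name per line. The function names `parse_list_langs` and `available_languages` are hypothetical.

```python
import shutil
import subprocess


def parse_list_langs(output):
    """Extract language names from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The first non-empty line is assumed to be a header such as
    # "List of available languages (2):"; the rest are language names.
    return lines[1:]


def available_languages(tessdata_dir):
    """Ask the tesseract binary which models it finds in tessdata_dir."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        ["tesseract", "--tessdata-dir", tessdata_dir, "--list-langs"],
        stdout=subprocess.PIPE,
        # Merge stderr into stdout, since some versions print the list there.
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_list_langs(result.stdout)
```

If `"MyTrainedData"` is missing from `available_languages("/usr/share/tesseract-ocr/4.00/tessdata")`, the file name, its permissions, or the file itself would be the next things to look at. Because stderr is merged in, any warning lines Tesseract prints may also show up in the result.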

How can I use my own trained data with Tesseract (pytesseract)?

0 answers: