I am trying to do OCR with Tess4J and have two sample files (one is a cropped part of the other):
File #1: https://i.ibb.co/gvDDr4X/TK-Banquet-Catering-Menu-10-15-19-P2.jpg
File #2: https://i.ibb.co/vQrP4N7/215215.jpg
When I run OCR on File #1, I get this output:
...
BUFFET OR PLATED MENU OPTIONS
umamwymmw hm,” Wimumm mmmammm “Wm";
mum "me meMmWnfl.“ mummmmmwmmm.” WM“.
mmmMme.mmsmmmsmm
BUFFET BUFFET ADDITIONAL OPTIONS
5,5 us
...
When I crop the unrecognized part out as File #2 and enlarge it 2x, the result is much better:
Here area few lypicaLrnenus below and their cost per person. Please let me know “you have any questions
or would Like to see any adjustments. We cfora FuLl Servlos Cocktall Bar to enhance- wur dining experience.
in addition Iawine. beerand specialty drinks. Best. Terry
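For reference, the crop-and-enlarge step can also be done in code rather than in an image editor. This is only a minimal sketch using the standard `java.awt` classes; the class name `CropAndScale`, the method name, and the use of bicubic interpolation are my assumptions for illustration, and it is not necessarily how File #2 was produced.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

public class CropAndScale {
    // Hypothetical helper: cut the rectangle (x, y, w, h) out of src
    // and enlarge it by an integer factor (2 = double the size).
    static BufferedImage cropAndScale(BufferedImage src, int x, int y,
                                      int w, int h, int factor) {
        // getSubimage returns a view onto the requested region of src
        BufferedImage region = src.getSubimage(x, y, w, h);

        // Target image that is `factor` times larger in each dimension
        BufferedImage out = new BufferedImage(
                w * factor, h * factor, BufferedImage.TYPE_INT_RGB);

        Graphics2D g = out.createGraphics();
        // Bicubic interpolation is an assumption; it usually keeps
        // small glyph edges smoother than nearest-neighbour scaling
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BICUBIC);
        g.drawImage(region, 0, 0, w * factor, h * factor, null);
        g.dispose();
        return out;
    }
}
```

The returned image could then be passed to the `doOCR(BufferedImage)` overload of `ITesseract` instead of a file. The region coordinates would have to be chosen per image.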
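If I double the size of the whole File #1, the output for the File #2 region still looks like "umamwymmw hm"...
How can I make Tess4J work better with small fonts?
My current code is: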
// Image received earlier and saved under "<chatId> <lastpart>"
java.io.File imageFile = new java.io.File(msg.getChatId() + " " + lastpart);
ITesseract instance = new Tesseract();
// Tesseract data path
instance.setDatapath("D://OCR");
try {
    // Make sure additional ImageIO reader plugins are registered
    ImageIO.scanForPlugins();
    String result = instance.doOCR(imageFile);
    System.out.println(result);
    sendMsg(msg, result);
} catch (TesseractException ex) {
    System.err.println(ex.getMessage());
}