Question

How can I get the correct numeric values from an image that contains English text? I am using the Tesseract engine.

Here is the code:

    import java.awt.Rectangle;
    import java.io.File;
    import net.sourceforge.tess4j.Tesseract;
    import net.sourceforge.tess4j.TesseractException;

    public static String tesseractOCR(String imgPath, Rectangle rect) {
        File imageFile = new File(imgPath);
        Tesseract instance = Tesseract.getInstance();  // JNA Interface Mapping
        // Tesseract1 instance = new Tesseract1(); // JNA Direct Mapping
        String result = "";
        try {
            // rect limits OCR to a region, e.g. new Rectangle(50, 128, 405 - 50, 228 - 128)
            result = instance.doOCR(imageFile, rect);
        } catch (TesseractException e) {
            System.err.println(e.getMessage());
        }
        return result;
    }

Answer 1

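Tesseract extracts every character it finds in the image, including letters, digits, and punctuation. You therefore need to strip the non-numeric characters from the extracted text yourself, for example with a regular expression.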
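A minimal sketch of that regex step, using only the Java standard library. The class name `DigitExtractor`, its two helper methods, and the sample string are hypothetical, chosen for illustration; the sample stands in for whatever `tesseractOCR` returns. One helper drops everything except digits, the other keeps each number as a separate token, which is probably what you want if the image contains more than one value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DigitExtractor {
    // Matches integers and simple decimals such as "42" or "3.14".
    private static final Pattern NUMBER = Pattern.compile("\\d+(?:\\.\\d+)?");

    /** Removes every character that is not an ASCII digit. */
    public static String digitsOnly(String ocrText) {
        return ocrText.replaceAll("[^0-9]", "");
    }

    /** Returns each number found in the text as a separate string. */
    public static List<String> extractNumbers(String ocrText) {
        List<String> numbers = new ArrayList<>();
        Matcher m = NUMBER.matcher(ocrText);
        while (m.find()) {
            numbers.add(m.group());
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Hypothetical OCR output; in practice this would come from tesseractOCR(...).
        String ocrText = "Total: 42 items, price 3.14 USD\n";
        System.out.println(digitsOnly(ocrText));      // prints 42314
        System.out.println(extractNumbers(ocrText));  // prints [42, 3.14]
    }
}
```

Note that `digitsOnly` merges separate values into one string, so `extractNumbers` is the safer choice when the region contains several numbers. Neither helper corrects OCR misreads such as `O` for `0`.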
