Question

I am trying to use OCR (optical character recognition). I have a sample image that I want to read data from. My sample image file is below.

I am using the tess4j API to read text from the image. My code is below.

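    import java.io.File;
    import net.sourceforge.tess4j.ITesseract;
    import net.sourceforge.tess4j.Tesseract;
    import net.sourceforge.tess4j.TesseractException;

    // Wrapper class added so the snippet compiles; the class name is illustrative.
    public class OcrDemo {

        // Runs OCR on the image at filePath and returns the recognized text.
        public static String crackImage(String filePath) {
            File imageFile = new File(filePath);
            ITesseract instance = new Tesseract();
            instance.setLanguage("eng"); // expects eng.traineddata in the tessdata folder
            try {
                return instance.doOCR(imageFile);
            } catch (TesseractException e) {
                System.err.println(e.getMessage());
                return "Error while reading image";
            }
        }

        public static void main(String[] args) {
            String results = crackImage("D:\\data\\testImage.PNG");
            System.out.print(results);
        }
    }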

Here are the dependencies in my pom.xml file.

    <dependencies>
        <dependency>  
            <groupId>net.sourceforge.tess4j</groupId>  
            <artifactId>tess4j</artifactId>  
            <version>3.2.1</version>  
        </dependency>
    </dependencies>

I created the tessdata\eng.traineddata structure in my project directory.

When I run the code, it executes without errors, but I get incorrect output (possibly in a different language), as shown below.

Creale a Voumhe metauzoa mwwer usmg szz

I am not sure why this text is printed even though I explicitly set the language to English. Can someone help me solve this problem?
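
One thing that may be worth trying, since Tesseract is generally reported to do better on larger, high-contrast text, is enlarging the image and converting it to grayscale before OCR. The following is only a minimal sketch under that assumption, not a confirmed fix for this image. The `OcrPreprocess` class and `upscaleToGray` method are hypothetical names made up for illustration; they are not part of tess4j and use only the standard `java.awt` classes.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration; not part of tess4j.
final class OcrPreprocess {

    // Returns a grayscale copy of src, enlarged by the given integer factor.
    static BufferedImage upscaleToGray(BufferedImage src, int factor) {
        if (factor < 1) {
            throw new IllegalArgumentException("factor must be >= 1");
        }
        int w = src.getWidth() * factor;
        int h = src.getHeight() * factor;
        // TYPE_BYTE_GRAY drops the color information while drawing.
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = out.createGraphics();
        try {
            // Bicubic interpolation keeps character edges smoother when enlarging.
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return out;
    }
}
```

The resulting image could then be handed to tess4j through `instance.doOCR(BufferedImage)` after loading the file with `ImageIO.read(new File(filePath))`. The scale factor (2 or 3, for example) is a guess that would need to be tuned against the actual image.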

net.sourceforge.tess4j returns incorrect results when reading text from an image

0 answers: