Question

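I am using this OCR library, http://sourceforge.net/projects/javaocr/, to detect digits in an image. I also tried Tesseract, but I have exactly the same problem: sometimes it does not work. Java OCR has never worked for me. When I use it, it produces no output other than "\n".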

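The image is white with black digits. The only artifacts in the image are two lines near the top and bottom borders, and they do not even touch the characters. The alignment is normal, like printed text; it is not handwritten or skewed.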

BufferedImage image2 = ImageIO.read(new File("moneyImage"+".bmp"));
ImageManipulator.show(image2, 5);
OCRScanner scanner = new OCRScanner();
String items = scanner.scan(image2, 0, 0, 0, 0, null);
System.out.println(items);

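image2 displays clearly, and this example is taken from someone else who posted it. I am not doing anything complicated, so it makes no sense to me why this does not work. It is a simple grayscale image.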

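When I run the standalone program (the Java OCR one), it works and produces the correct digits as output. I cannot figure out how to extract the characters from within my own Java project, or why it does not work there.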

My test image is: Test image

Also, this

String lastText = null;
Tesseract instance = Tesseract.getInstance();
try {
    lastText = instance.doOCR(imageFile);
} catch (TesseractException ex) {
    Logger.getLogger(ActionAbstraction.class.getName()).log(Level.SEVERE, null, ex);
}

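produces no output at all, even when I give it a picture of a single digit that was output by Java OCR. Both libraries seem to work, but neither of them outputs anything when I do the actual scan.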

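Also, I am using TIFF images, and as I said before, character extraction works fine. What does not work is the Java code that invokes the scan on the image. I have linked the appropriate libraries (otherwise I would get compiler errors).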

Answer 1

Not sure, but are you telling the scanner to look only at the top-left corner of the image with this line:

String items = scanner.scan(image2, 0, 0, 0, 0, null);

Possibly change it to (something like):

String items = scanner.scan(image2, 0, 0, 80, 20, null);

[Change 80, 20 to the width/height of your image. You can get Java to do this for you; I think there is a method in the Image class, if I remember correctly.]
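The width/height lookup mentioned above can be sketched with `BufferedImage.getWidth()` and `getHeight()` from the JDK. This is only an illustration: `ScanBounds` and `fullImageBounds` are hypothetical names, the 80x20 image is a stand-in for the file loaded in the question, and whether `scan` expects the width itself or `width - 1` as the right edge is an assumption that is not confirmed here.

```java
import java.awt.image.BufferedImage;

public class ScanBounds {
    /** Returns {left, top, right, bottom} covering the whole image (hypothetical helper). */
    public static int[] fullImageBounds(BufferedImage image) {
        return new int[] {0, 0, image.getWidth(), image.getHeight()};
    }

    public static void main(String[] args) {
        // Stand-in for ImageIO.read(new File("moneyImage.bmp")); 80x20 is an arbitrary size.
        BufferedImage image2 = new BufferedImage(80, 20, BufferedImage.TYPE_BYTE_GRAY);
        int[] b = fullImageBounds(image2);
        System.out.println(b[2] + "x" + b[3]);
        // With the OCR library on the classpath, the call from the question would become:
        // String items = scanner.scan(image2, b[0], b[1], b[2], b[3], null);
    }
}
```

Passing the real dimensions avoids hard-coding 80 and 20 and rules out the region arguments as the cause if the output is still empty.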

I got this (perhaps mistaken) idea from a git clone of the source code:

git clone git://git.code.sf.net/p/javaocr/source javaocr-source

In the "javaocr-source\core\src\main\java" directory, the interface contained in 'java.net.sourceforge.javaocr.ImageScanner.java' defines the 'scan' interface as follows:

//

void scan(
            Image image,
            DocumentScannerListener listener,
            int left,
            int top,
            int right,
            int bottom);
}

//

Answer 2

Here is the Javadoc for the scan function that I found in the project's source code:

 /**
 * Scan an image and return the decoded text.
 * @param image The <code>Image</code> to be scanned.
 * @param x1 The leftmost pixel position of the area to be scanned, or
 * <code>0</code> to start scanning at the left boundary of the image.
 * @param y1 The topmost pixel position of the area to be scanned, or
 * <code>0</code> to start scanning at the top boundary of the image.
 * @param x2 The rightmost pixel position of the area to be scanned, or
 * <code>0</code> to stop scanning at the right boundary of the image.
 * @param y2 The bottommost pixel position of the area to be scanned, or
 * <code>0</code> to stop scanning at the bottom boundary of the image.
 * @param acceptableChars An array of <code>CharacterRange</code> objects
 * representing the ranges of characters which are allowed to be decoded,
 * or <code>null</code> to not limit which characters can be decoded.
 * @return The decoded text.
 */

So

String items = scanner.scan(image2, 0, 0, 0, 0, null);

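seems fine according to the code documentation. But I tried it, and it does not work. This is some of the worst documentation I have ever seen.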
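For reference, the convention described in the Javadoc quoted above (a value of 0 for x2 or y2 means "stop at the image boundary") can be sketched as follows. This is a reading of the documentation only, not the library's actual code; `ScanRegion` and `resolve` are hypothetical names. Since the all-zero call reportedly returns nothing, resolving the region in your own code and passing explicit values is one thing to try.

```java
import java.util.Arrays;

public class ScanRegion {
    /**
     * Resolves the documented convention: a right or bottom value of 0
     * is replaced by the image width or height (hypothetical helper).
     */
    public static int[] resolve(int x1, int y1, int x2, int y2, int width, int height) {
        int right = (x2 == 0) ? width : x2;
        int bottom = (y2 == 0) ? height : y2;
        return new int[] {x1, y1, right, bottom};
    }

    public static void main(String[] args) {
        // All-zero arguments on an 80x20 image resolve to the full image.
        System.out.println(Arrays.toString(resolve(0, 0, 0, 0, 80, 20))); // prints [0, 0, 80, 20]
    }
}
```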

Java OCR produces no output
