Is it possible to convert an image of text into a text file? (Java)

Time: 2014-08-28 02:37:09

Tags: java ocr

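Is it possible to convert an image of text into a text file?

What is the right algorithm to make this possible?

I'm very new to this, and since I'm still a student I'd like to learn more about it.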

1 answer:

Answer 0: (score: 4)

I use mseOCR, a plugin of the JavaOCR library. I'll share my example for reference. For more details, see here.
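To give a rough idea of the kind of algorithm involved: the "mse" in mseOCR stands for mean squared error, and the general approach is to compare each glyph found in the image against trained sample glyphs and pick the closest one. The following is only a minimal sketch of that idea, not JavaOCR's actual code. The class `MseMatchSketch`, its methods, and the 3x3 toy templates are all hypothetical, and it assumes glyphs have already been segmented and scaled to the same size.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of MSE-based glyph matching; this is not JavaOCR code. */
final class MseMatchSketch {

    /** Mean squared error between two equally sized grayscale pixel arrays. */
    static double mse(int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("glyphs must have the same size");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum / a.length;
    }

    /** Returns the character whose template has the lowest MSE against the glyph. */
    static char classify(int[] glyph, Map<Character, int[]> templates) {
        char best = '?';
        double bestError = Double.MAX_VALUE;
        for (Map.Entry<Character, int[]> e : templates.entrySet()) {
            double error = mse(glyph, e.getValue());
            if (error < bestError) {
                bestError = error;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy 3x3 templates (0 = black ink, 255 = white background), row by row.
        Map<Character, int[]> templates = new LinkedHashMap<>();
        templates.put('I', new int[] {255, 0, 255, 255, 0, 255, 255, 0, 255});
        templates.put('L', new int[] {0, 255, 255, 0, 255, 255, 0, 0, 0});
        templates.put('T', new int[] {0, 0, 0, 255, 0, 255, 255, 0, 255});

        // A noisy 'L': the centre pixel is gray instead of white.
        int[] glyph = {0, 255, 255, 0, 128, 255, 0, 0, 0};
        System.out.println(classify(glyph, templates)); // prints L
    }
}
```

A real OCR library also has to find lines and characters in the image and normalize their size before any such comparison, which is what the scanner classes take care of.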

Let's start by scanning the image below. The image file contains most of the characters. Make sure ascii.png is added to the classpath.

ascii.png


ImageScanner.java

import java.awt.Image;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;

import javax.imageio.ImageIO;

import net.sourceforge.javaocr.ocrPlugins.mseOCR.CharacterRange;
import net.sourceforge.javaocr.ocrPlugins.mseOCR.OCRScanner;
import net.sourceforge.javaocr.ocrPlugins.mseOCR.TrainingImage;
import net.sourceforge.javaocr.ocrPlugins.mseOCR.TrainingImageLoader;
import net.sourceforge.javaocr.scanner.PixelImage;

public class ImageScanner {
    public static void main(String[] args) throws Exception {
        // Train the scanner: the glyphs in ascii.png are used as samples
        // for the characters '!' through '~'.
        OCRScanner scanner = new OCRScanner();
        TrainingImageLoader loader = new TrainingImageLoader();
        HashMap<Character, ArrayList<TrainingImage>> trainingImageMap = new HashMap<Character, ArrayList<TrainingImage>>();
        loader.load("ascii.png", new CharacterRange('!', '~'), trainingImageMap);
        scanner.addTrainingImages(trainingImageMap);

        // Load the image to recognize. new File("ascii.png") is resolved
        // against the current working directory.
        Image image = ImageIO.read(new File("ascii.png"));

        // Build a PixelImage, convert it to grayscale and filter it.
        // Note: this object is not passed to scan() below, which receives
        // the original image.
        PixelImage pixelImage = new PixelImage(image);
        pixelImage.toGrayScale(true);
        pixelImage.filter();

        // Recognize the text and print it to the console.
        String text = scanner.scan(image, 0, 0, 0, 0, null);
        System.out.println(text);
    }
}

Output

!"#$%&' ()*+,-
./0123456789:;<=>?@ABCDEFGHIJKLMNO
PQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuv
wxyz{ | }~  

JAR files used:

javaocr-core-1.0.jar
javaocr-plugin-awt-1.0.jar
javaocr-plugin-cluster-1.0.jar
javaocr-plugin-fir-1.0.jar
javaocr-plugin-moment-1.0.jar
javaocr-plugin-morphology-1.0.jar
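
The question asks for a text file, while the example prints the recognized text to the console. One possible way to save the result is sketched below using only the standard library. The class `TextFileWriterSketch`, the `save` method, and the file name `output.txt` are hypothetical, and the literal string stands in for the value returned by `scanner.scan(...)`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that stores recognized text in a file. */
final class TextFileWriterSketch {

    /** Writes the text to the target file as UTF-8, replacing any existing content. */
    static void save(String text, Path target) throws IOException {
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the String returned by scanner.scan(...).
        String text = "Hello, OCR";
        Path target = Paths.get("output.txt"); // hypothetical output file name
        save(text, target);
        System.out.println("Wrote " + target.toAbsolutePath());
    }
}
```

In ImageScanner, this would amount to calling `save(text, Paths.get("output.txt"))` in place of, or in addition to, `System.out.println(text)`.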