I have created a small console application that performs OCR on .tiff image files, using tess4j.
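import java.io.File;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

public class JavaApplication10 {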
/**
* @param args the command line arguments
*/
public static void main(String[] args)
{
File imageFile = new File("C:\\Users\\Manesh\\Desktop\\license_plate.tiff");
Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
// Tesseract1 instance = new Tesseract1(); // JNA Direct Mapping
try
{
String result = instance.doOCR(imageFile); //Empty result
System.out.println("hahahaha");
System.out.println("The result is: " + result);
}
catch (TesseractException e)
{
System.out.println("error:" + e);
}
}
}
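I am not getting any value in the result. When I looked into the code of the Tesseract class and inserted some System.out.println calls, those were not printed to the console either. My Tesseract code is shown below.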
public class Tesseract
{
private static Tesseract instance;
private final static Rectangle EMPTY_RECTANGLE = new Rectangle();
private String language = "eng";
private String datapath = "tessdata";
private int psm = TessAPI.TessPageSegMode.PSM_AUTO;
private boolean hocr;
private int pageNum;
private int ocrEngineMode = TessAPI.TessOcrEngineMode.OEM_DEFAULT;
private Properties prop = new Properties();
public final static String htmlBeginTag =
"<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\""
+ " \"http://www.w3.org/TR/html4/loose.dtd\">\n"
+ "<html>\n<head>\n<title></title>\n"
+ "<meta http-equiv=\"Content-Type\" content=\"text/html;"
+ "charset=utf-8\" />\n<meta name='ocr-system' content='tesseract'/>\n"
+ "</head>\n<body>\n";
public final static String htmlEndTag = "</body>\n</html>\n";
private Tesseract()
{
System.setProperty("jna.encoding", "UTF8");
}
public static synchronized Tesseract getInstance()
{
if (instance == null)
{
instance = new Tesseract();
}
return instance;
}
public void setDatapath(String datapath)
{
this.datapath = datapath;
}
public void setLanguage(String language)
{
this.language = language;
}
public void setOcrEngineMode(int ocrEngineMode)
{
this.ocrEngineMode = ocrEngineMode;
}
public void setPageSegMode(int mode)
{
this.psm = mode;
}
public void setHocr(boolean hocr)
{
this.hocr = hocr;
prop.setProperty("tessedit_create_hocr", hocr ? "1" : "0");
}
public void setTessVariable(String key, String value)
{
prop.setProperty(key, value);
}
public String doOCR(File imageFile) throws TesseractException
{
System.out.println("hiiiiiii "); //not getting printed
return doOCR(imageFile, null);
}
public String doOCR(File imageFile, Rectangle rect) throws TesseractException
{
try
{
System.out.println("be: "); //not getting printed
return doOCR(ImageIOHelper.getIIOImageList(imageFile), rect);
}
catch (IOException ioe)
{
throw new TesseractException(ioe);
}
}
public String doOCR(BufferedImage bi) throws TesseractException
{
return doOCR(bi, null);
}
public String doOCR(BufferedImage bi, Rectangle rect) throws TesseractException
{
IIOImage oimage = new IIOImage(bi, null, null);
List<IIOImage> imageList = new ArrayList<IIOImage>();
imageList.add(oimage);
return doOCR(imageList, rect);
}
public String doOCR(List<IIOImage> imageList, Rectangle rect) throws TesseractException
{
StringBuilder sb = new StringBuilder();
pageNum = 0;
for (IIOImage oimage : imageList)
{
pageNum++;
try
{
ByteBuffer buf = ImageIOHelper.getImageByteBuffer(oimage);
RenderedImage ri = oimage.getRenderedImage();
String pageText = doOCR(ri.getWidth(), ri.getHeight(), buf, rect, ri.getColorModel().getPixelSize());
sb.append(pageText);
}
catch (IOException ioe)
{
//skip the problematic image
System.err.println(ioe.getMessage());
}
}
if (hocr)
{
sb.insert(0, htmlBeginTag).append(htmlEndTag);
}
return sb.toString();
}
public String doOCR(int xsize, int ysize, ByteBuffer buf, Rectangle rect, int bpp) throws TesseractException
{
TessAPI api = TessAPI.INSTANCE;
TessAPI.TessBaseAPI handle = api.TessBaseAPICreate();
api.TessBaseAPIInit2(handle, datapath, language, ocrEngineMode);
api.TessBaseAPISetPageSegMode(handle, psm);
Enumeration em = prop.propertyNames();
while (em.hasMoreElements())
{
String key = (String) em.nextElement();
api.TessBaseAPISetVariable(handle, key, prop.getProperty(key));
}
int bytespp = bpp / 8;
int bytespl = (int) Math.ceil(xsize * bpp / 8.0);
api.TessBaseAPISetImage(handle, buf, xsize, ysize, bytespp, bytespl);
if (rect != null && !rect.equals(EMPTY_RECTANGLE))
{
api.TessBaseAPISetRectangle(handle, rect.x, rect.y, rect.width, rect.height);
}
Pointer utf8Text = hocr ? api.TessBaseAPIGetHOCRText(handle, pageNum - 1) : api.TessBaseAPIGetUTF8Text(handle);
String str = utf8Text.getString(0);
api.TessDeleteText(utf8Text);
api.TessBaseAPIDelete(handle);
return str;
}
}
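This is my first time using Tesseract; please tell me what I am doing wrong.
Answer 0 (score: 1)
With Tesseract, you have to pass the exact image you want to OCR. For example, suppose you are reading a player's chest number: if you pass a cropped, grayscale image of just the chest number, it will read the text, whereas if you pass the whole image, it will not be able to read it. You can use: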
String doOCR(BufferedImage img, Rectangle rect);
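I pass the cropped image directly, so I did not use the method above. My code now looks like this:
import java.awt.color.ColorSpace;
import java.awt.image.BufferedImage;
import java.awt.image.ColorConvertOp;
import java.io.File;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.imageio.ImageIO;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

public class JavaApplication10 {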
/**
* @param args the command line arguments
*/
public static void main(String[] args)
{
try
{
File imageFile = new File("C:\\Users\\Manesh\\Desktop\\116.jpg"); //This is a cropped image of a chest number.
BufferedImage img = ImageIO.read(imageFile);
//BufferedImageOp grayscaleConv = new ColorConvertOp(colorFrame.getColorModel().getColorSpace(), grayscaleConv.filter(colorFrame, grayFrame);
Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
ColorSpace cs = ColorSpace.getInstance(ColorSpace.CS_GRAY);
ColorConvertOp op = new ColorConvertOp(cs, null);
op.filter(img, img); // gray scaling the image
// Tesseract1 instance = new Tesseract1(); // JNA Direct Mapping
try
{
String result = instance.doOCR(img);
System.out.println("hahahaha");
System.out.println("The result is: " + result);
}
catch (TesseractException e)
{
System.out.println("error:" + e);
}
}
catch (IOException ex)
{
Logger.getLogger(JavaApplication10.class.getName()).log(Level.SEVERE, null, ex);
}
}
}
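The cropping and grayscale steps described above can be sketched with the standard library alone. This is a minimal sketch, not the code used in this answer: the class name `OcrPreprocess` and the method `cropToGray` are hypothetical, and it draws the cropped region onto a `TYPE_BYTE_GRAY` image with `Graphics2D.drawImage` instead of using `ColorConvertOp`, so that the result is an 8-bit grayscale image.

```java
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of tess4j: prepares a region of an image for OCR.
final class OcrPreprocess {
    private OcrPreprocess() {}

    /** Crops {@code region} out of {@code source} and returns a new 8-bit grayscale copy. */
    static BufferedImage cropToGray(BufferedImage source, Rectangle region) {
        // Clip the requested region to the image bounds so getSubimage does not throw.
        Rectangle bounds = new Rectangle(0, 0, source.getWidth(), source.getHeight());
        Rectangle clipped = region.intersection(bounds);
        if (clipped.isEmpty()) {
            throw new IllegalArgumentException("Region lies outside the image: " + region);
        }

        // getSubimage shares pixel data with the source; it does not copy.
        BufferedImage cropped =
                source.getSubimage(clipped.x, clipped.y, clipped.width, clipped.height);

        // Drawing onto a TYPE_BYTE_GRAY image converts the pixels to grayscale
        // and leaves the source image untouched.
        BufferedImage gray = new BufferedImage(
                clipped.width, clipped.height, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = gray.createGraphics();
        try {
            g.drawImage(cropped, 0, 0, null);
        } finally {
            g.dispose();
        }
        return gray;
    }
}
```

Assuming the rectangle around the chest number is known, the returned image could be passed to `instance.doOCR(...)` in place of `img` in the code above.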
This is what I found; if I am wrong anywhere, please feel free to correct me.