I am building an OCR app for Android using the Tesseract OCR engine. For some reason, whenever I run the engine on a photo, it returns empty text. Here is my code:
public String detectText(Bitmap bitmap) {
    TessBaseAPI tessBaseAPI = new TessBaseAPI();
    String mDataDir = setTessData();
    tessBaseAPI.setDebug(true);
    tessBaseAPI.init(mDataDir + File.separator, "eng");
    tessBaseAPI.setImage(bitmap);
    tessBaseAPI.setPageSegMode(TessBaseAPI.OEM_TESSERACT_ONLY);
    String text = tessBaseAPI.getUTF8Text();
    tessBaseAPI.end();
    return text;
}
private String setTessData() {
    String mDataDir = this.getExternalFilesDir("data").getAbsolutePath();
    String mTrainedDataPath = mDataDir + File.separator + "tessdata";
    String mLang = "eng";
    // Create the tessdata folder if it does not exist yet
    File dir = new File(mTrainedDataPath);
    if (!dir.exists()) {
        if (!dir.mkdirs()) {
            //showDialogFragment(SD_ERR_DIALOG, "sd_err_dialog");
        }
    }
    // Check whether the language file already exists inside the data folder
    if (!(new File(mTrainedDataPath + File.separator + mLang + ".traineddata")).exists()) {
        // If English or Hebrew, we just copy the file from assets
        if (mLang.equals("eng") || mLang.equals("heb")) {
            try {
                AssetManager assetManager = context.getAssets();
                InputStream in = assetManager.open(mLang + ".traineddata");
                OutputStream out = new FileOutputStream(mTrainedDataPath + File.separator + mLang + ".traineddata");
                copyFile(in, out);
                //Toast.makeText(context, getString(R.string.selected_language) + " " + mLangArray[mLangID], Toast.LENGTH_SHORT).show();
                //Log.v(TAG, "Copied " + mLang + " traineddata");
            } catch (IOException e) {
                //showDialogFragment(SD_ERR_DIALOG, "sd_err_dialog");
            }
        } else {
            // Check whether the network is available
            if (!isNetworkAvailable(this)) {
                //showDialogFragment(NETWORK_ERR_DIALOG, "network_err_dialog");
            } else {
                // Show a dialog with the file size. When the user clicks OK, the download starts.
                // If the user presses Cancel, revert to English (as on a network error).
                //showDialogFragment(CONTINUE_DIALOG, "continue_dialog");
            }
        }
    } else {
        //Toast.makeText(mThis, getString(R.string.selected_language) + " " + mLangArray[mLangID], Toast.LENGTH_SHORT).show();
    }
    return mDataDir;
}
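I have debugged it several times, and the bitmap is passed correctly to the detectText method. The language data files (tessdata) exist on the phone, and their paths are correct as well.
Does anyone know what is wrong here?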
Answer 0 (score: 1)
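You are setting the page segmentation mode with an OCR engine mode (OEM) enum value in your detectText() method. Both kinds of constants are plain ints, so the call compiles, but the engine reads the value as a page segmentation mode rather than as the engine mode you intended.
detectText(Bitmap bitmap) {
    ...
    tessBaseAPI.setPageSegMode(TessBaseAPI.OEM_TESSERACT_ONLY);
    ...
}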
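To make the effect concrete: as far as I know, OEM_TESSERACT_ONLY has the value 0 in tess-two, the same value as PageSegMode.PSM_OSD_ONLY, which performs orientation and script detection only and produces no recognized text. The plain-Java sketch below mirrors a few constants by hand to show the collision. The class PageSegModeDemo and the helper runsOcr are illustrative names, not part of the library, and the mirrored values are an assumption you should check against your tess-two version.

```java
// Plain-Java sketch, no Android or tess-two dependency.
// The constants are copied by hand (assumed values; verify against your library version).
class PageSegModeDemo {
    static final int OEM_TESSERACT_ONLY = 0; // an OCR engine mode, not a page segmentation mode
    static final int PSM_OSD_ONLY = 0;       // orientation and script detection only
    static final int PSM_AUTO_ONLY = 2;      // segmentation only, no OSD or OCR
    static final int PSM_AUTO = 3;           // fully automatic segmentation, runs OCR
    static final int PSM_COUNT = 13;         // number of page segmentation modes

    /** Returns true if the given page segmentation mode actually performs OCR. */
    static boolean runsOcr(int psm) {
        if (psm < 0 || psm >= PSM_COUNT) {
            throw new IllegalArgumentException("Not a page segmentation mode: " + psm);
        }
        // Per the mode descriptions, these two modes stop before recognition.
        return psm != PSM_OSD_ONLY && psm != PSM_AUTO_ONLY;
    }

    public static void main(String[] args) {
        // The engine-mode constant is silently treated as PSM_OSD_ONLY.
        System.out.println("OEM_TESSERACT_ONLY as PSM runs OCR: " + runsOcr(OEM_TESSERACT_ONLY));
        System.out.println("PSM_AUTO runs OCR: " + runsOcr(PSM_AUTO));
    }
}
```

Because Java sees both constants as int, the compiler cannot catch this mix-up; only the values passed at runtime matter.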
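Depending on the kind of image you are trying to read, setting an appropriate page segmentation mode helps character detection.
For example:
tessBaseAPI.setPageSegMode(TessBaseAPI.PageSegMode.PSM_AUTO);
TessBaseAPI.java defines the other page segmentation values: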
/** Page segmentation mode. */
public static final class PageSegMode {
    /** Orientation and script detection only. */
    public static final int PSM_OSD_ONLY = 0;
    /** Automatic page segmentation with orientation and script detection. (OSD) */
    public static final int PSM_AUTO_OSD = 1;
    /** Fully automatic page segmentation, but no OSD, or OCR. */
    public static final int PSM_AUTO_ONLY = 2;
    /** Fully automatic page segmentation, but no OSD. */
    public static final int PSM_AUTO = 3;
    /** Assume a single column of text of variable sizes. */
    public static final int PSM_SINGLE_COLUMN = 4;
    /** Assume a single uniform block of vertically aligned text. */
    public static final int PSM_SINGLE_BLOCK_VERT_TEXT = 5;
    /** Assume a single uniform block of text. (Default.) */
    public static final int PSM_SINGLE_BLOCK = 6;
    /** Treat the image as a single text line. */
    public static final int PSM_SINGLE_LINE = 7;
    /** Treat the image as a single word. */
    public static final int PSM_SINGLE_WORD = 8;
    /** Treat the image as a single word in a circle. */
    public static final int PSM_CIRCLE_WORD = 9;
    /** Treat the image as a single character. */
    public static final int PSM_SINGLE_CHAR = 10;
    /** Find as much text as possible in no particular order. */
    public static final int PSM_SPARSE_TEXT = 11;
    /** Sparse text with orientation and script detection. */
    public static final int PSM_SPARSE_TEXT_OSD = 12;
    /** Number of enum entries. */
    public static final int PSM_COUNT = 13;
}
You can try the different page segmentation values and see which one gives the best results.
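One way to run that comparison systematically is to loop over a few candidate modes and keep the one that yields the most text. The sketch below is plain Java so it can run off-device: the class PsmSweep and the method bestMode are hypothetical names, and the recognizer is passed in as a function from mode to text. On Android that function would, under the assumptions of this thread, call setPageSegMode(mode) and then getUTF8Text() on an initialized TessBaseAPI with the image already set. Counting non-whitespace characters is a crude stand-in for quality, chosen only for illustration.

```java
// Plain-Java sketch of comparing page segmentation modes.
// The recognizer is injected so the selection logic can be tried without a device.
class PsmSweep {
    /**
     * Returns the mode whose recognized text contains the most non-whitespace
     * characters, or -1 if every mode produced empty (or null) text.
     */
    static int bestMode(int[] modes, java.util.function.IntFunction<String> recognize) {
        int best = -1;
        int bestLength = 0;
        for (int mode : modes) {
            String text = recognize.apply(mode);
            int length = (text == null) ? 0 : text.replaceAll("\\s+", "").length();
            if (length > bestLength) {
                bestLength = length;
                best = mode;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // 3 = PSM_AUTO, 6 = PSM_SINGLE_BLOCK, 7 = PSM_SINGLE_LINE, 11 = PSM_SPARSE_TEXT
        int[] modes = {3, 6, 7, 11};
        // Fake recognizer standing in for the OCR engine (made-up outputs).
        java.util.function.IntFunction<String> fake =
                mode -> mode == 6 ? "Hello world" : (mode == 3 ? "Hello" : "");
        System.out.println("Best mode: " + bestMode(modes, fake));
    }
}
```

Text length rewards noisy output as much as correct output, so for real images it is worth also inspecting the recognized text by eye before settling on a mode.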