I'm using the Tesseract library in my Android OCR app, and I need the bounding box of each character. I followed this tutorial, but the code I wrote gives an error. Here is my code:
TessBaseAPI baseApi = new TessBaseAPI();
baseApi.setDebug(true);
baseApi.init(Path, Lang);
baseApi.setImage(ReadFile.readBitmap(BitmapBiner));
baseApi.setVariable(TessBaseAPI.VAR_CHAR_BLACKLIST, CharacterBlacklist);
baseApi.setVariable(TessBaseAPI.VAR_CHAR_WHITELIST, CharacterWhitelist);
String RecognizedText = baseApi.getUTF8Text();
List<Rect> characterBoundingBoxes = baseApi.getCharacters().getBoxRects();
BitmapBiner = BitmapBiner.copy(Bitmap.Config.RGB_565, true);
Canvas canvas = new Canvas(BitmapBiner);
// draw bounding box for each character
for (int i = 0; i < characterBoundingBoxes.size(); i++) {
paint.setAlpha(0xFF);
paint.setColor(0xFF00CCFF);
paint.setStyle(Style.STROKE);
paint.setStrokeWidth(1);
Rect r = characterBoundingBoxes.get(i);
canvas.drawRect(r, paint);
}
This reports an error on line 8: "The method getCharacters() is undefined for the type TessBaseAPI". So I decided to try another approach, ResultIterator. Here is my code:
TessBaseAPI baseApi = new TessBaseAPI();
baseApi.setDebug(true);
baseApi.init(Path, Lang);
baseApi.setImage(ReadFile.readBitmap(BitmapBiner));
baseApi.setVariable(TessBaseAPI.VAR_CHAR_BLACKLIST, CharacterBlacklist);
baseApi.setVariable(TessBaseAPI.VAR_CHAR_WHITELIST, CharacterWhitelist);
String RecognizedText = baseApi.getUTF8Text();
final ResultIterator iterator = baseApi.getResultIterator();
String lastUTF8Text;
float lastConfidence;
int[] lastBoundingBox;
int count = 0;
iterator.begin();
do {
lastUTF8Text = iterator.getUTF8Text(PageIteratorLevel.RIL_SYMBOL);
lastConfidence = iterator.confidence(PageIteratorLevel.RIL_SYMBOL);
lastBoundingBox = iterator.getBoundingBox(PageIteratorLevel.RIL_SYMBOL);
count++;
} while (iterator.next(PageIteratorLevel.RIL_SYMBOL));
BitmapBiner = BitmapBiner.copy(Bitmap.Config.RGB_565, true);
Canvas canvas = new Canvas(BitmapBiner);
// draw bounding box for each character
for (int i = 0; i < lastBoundingBox.length; i++) {
paint.setAlpha(0xA0);
paint.setColor(Color.RED);
paint.setStyle(Style.STROKE);
paint.setStrokeWidth(1);
Rect r = new Rect(lastBoundingBox[0], lastBoundingBox[1],
lastBoundingBox[2], lastBoundingBox[3]);
canvas.drawRect(r, paint);
}
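This works so far, but only the last character gets a bounding box. For example, if the word is "DOG", the only character with a bounding box in the image is "G"; the other characters have no box drawn around them. Is it impossible to do this with the Tesseract library? Thanks.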
Answer 0 (score: 1)
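You should move the drawing inside the do loop. As written, lastBoundingBox is overwritten on every iteration, so once the loop finishes it holds only the last symbol's box, and that is the only one that gets drawn.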
Update
BitmapBiner = BitmapBiner.copy(Bitmap.Config.RGB_565, true);
Canvas canvas = new Canvas(BitmapBiner);
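// configure the paint once, before the loop
paint.setColor(Color.RED);
paint.setAlpha(0xA0); // set after setColor(), which also overwrites the alpha
paint.setStyle(Style.STROKE);
paint.setStrokeWidth(1);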
do {
lastUTF8Text = iterator.getUTF8Text(PageIteratorLevel.RIL_SYMBOL);
lastConfidence = iterator.confidence(PageIteratorLevel.RIL_SYMBOL);
lastBoundingBox = iterator.getBoundingBox(PageIteratorLevel.RIL_SYMBOL);
count++;
// draw bounding box for each character
Rect r = new Rect(lastBoundingBox[0], lastBoundingBox[1],
lastBoundingBox[2], lastBoundingBox[3]);
canvas.drawRect(r, paint);
} while (iterator.next(PageIteratorLevel.RIL_SYMBOL));
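If you would rather keep the drawing separate from the recognition loop, another option is to collect every box into a list and draw them afterwards. The sketch below is plain Java so it can run without Android or Tesseract: the Iterator<int[]> is only a stand-in for ResultIterator, and the class name, method name and coordinates are made up for illustration. In the real app you would add the result of iterator.getBoundingBox(PageIteratorLevel.RIL_SYMBOL) to the list inside the do/while loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Tesseract or tess-two.
final class BoxCollector {

    // Copies every {left, top, right, bottom} box into a list, so no box is
    // lost when the loop variable is overwritten on the next iteration.
    static List<int[]> collectBoxes(Iterator<int[]> symbolBoxes) {
        List<int[]> boxes = new ArrayList<>();
        while (symbolBoxes.hasNext()) {
            // clone() guards against the source reusing the same array
            boxes.add(symbolBoxes.next().clone());
        }
        return boxes;
    }

    public static void main(String[] args) {
        // Made-up coordinates standing in for the three symbols of "DOG".
        List<int[]> fakeSymbols = Arrays.asList(
                new int[] {10, 5, 30, 40},
                new int[] {35, 5, 55, 40},
                new int[] {60, 5, 80, 40});

        List<int[]> boxes = collectBoxes(fakeSymbols.iterator());

        // In the Android app this loop would build a Rect from each box
        // and call canvas.drawRect(r, paint).
        for (int[] b : boxes) {
            System.out.println(Arrays.toString(b));
        }
    }
}
```

With the list in hand, the drawing loop iterates over the boxes themselves rather than over the four coordinates of a single box, which is what the loop in the question was doing.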