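When scanning text with the vision API, the overlay returns the detected text boxes as an unsorted list. As a result, when I read the text in a loop, I sometimes get it in the wrong order; for example, text from the bottom of the page comes first.
Sample code for receiveDetections in OcrDetectorProcessor.java: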
    @Override
    public void receiveDetections(Detector.Detections<TextBlock> detections) {
        mGraphicOverlay.clear();
        SparseArray<TextBlock> items = detections.getDetectedItems();
        for (int i = 0; i < items.size(); ++i) {
            TextBlock item = items.valueAt(i);
            OcrGraphic graphic = new OcrGraphic(mGraphicOverlay, item);
            mGraphicOverlay.add(graphic);
        }
    }
In this code, I want to sort the mGraphicOverlay list by the position of each TextBlock.
Any solution or suggestion would be very helpful.
Answer 0 (score: 4)
You need to sort the output, as shown in the OCR sample code. I split the text blocks into lines before sorting.
Here is my code:
    // origTextBlocks is presumably the SparseArray<TextBlock> returned by
    // detections.getDetectedItems().
    // Flatten every block into its component lines.
    List<Text> textLines = new ArrayList<>();
    for (int i = 0; i < origTextBlocks.size(); i++) {
        TextBlock textBlock = origTextBlocks.valueAt(i);
        textLines.addAll(textBlock.getComponents());
    }

    // Sort by the top edge first; fall back to the left edge on a tie.
    Collections.sort(textLines, new Comparator<Text>() {
        @Override
        public int compare(Text t1, Text t2) {
            int byTop = Integer.compare(t1.getBoundingBox().top, t2.getBoundingBox().top);
            if (byTop != 0) {
                return byTop;
            }
            return Integer.compare(t1.getBoundingBox().left, t2.getBoundingBox().left);
        }
    });

    // Join the sorted lines, one per line of output.
    StringBuilder textBuilder = new StringBuilder();
    for (Text text : textLines) {
        if (text != null && text.getValue() != null) {
            textBuilder.append(text.getValue()).append('\n');
        }
    }
    String ocrString = textBuilder.toString();
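The top-then-left ordering above can also be written with chained comparators. This is only a minimal sketch, not the code from this answer: `OcrLine` and `TopLeftOrder` are hypothetical stand-ins so that it runs without the Android vision classes. In an app, the values would come from `Text.getValue()` and `Text.getBoundingBox()`. `Comparator.comparingInt` is a Java 8 API, so older Android targets may need desugaring.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for one recognized line: its text plus the top and
// left coordinates of its bounding box.
final class OcrLine {
    final String value;
    final int top;
    final int left;

    OcrLine(String value, int top, int left) {
        this.value = value;
        this.top = top;
        this.left = left;
    }
}

final class TopLeftOrder {
    // Top edge first, left edge as the tie-breaker.
    static final Comparator<OcrLine> BY_TOP_THEN_LEFT =
            Comparator.comparingInt((OcrLine line) -> line.top)
                    .thenComparingInt(line -> line.left);

    // Returns the line texts in sorted order, one per output line.
    static String join(List<OcrLine> lines) {
        List<OcrLine> sorted = new ArrayList<>(lines); // leave the input untouched
        sorted.sort(BY_TOP_THEN_LEFT);
        StringBuilder builder = new StringBuilder();
        for (OcrLine line : sorted) {
            builder.append(line.value).append('\n');
        }
        return builder.toString();
    }
}
```

With lines at (top 300, left 10), (top 20, left 200) and (top 20, left 10), `TopLeftOrder.join` emits the two lines at top 20 left to right, followed by the line at top 300.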
Answer 1 (score: 3)
I created a text block comparator like this:
    // Orders text blocks from top to bottom by the top edge of their bounding box.
    public static Comparator<TextBlock> TextBlockComparator
            = new Comparator<TextBlock>() {
        @Override
        public int compare(TextBlock textBlock1, TextBlock textBlock2) {
            return Integer.compare(textBlock1.getBoundingBox().top,
                    textBlock2.getBoundingBox().top);
        }
    };
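Use it with Arrays.sort(myTextBlocks, Utils.TextBlockComparator);
Update
Today I had time to test @rajesh's answer. Sorting by text block seems to be more accurate than sorting by text line.
For a complete tutorial, see Simple example of OCRReader in Android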
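Answer 2 (score: 0)
Well, if you have time, please test my code. It was designed carefully and tested over a long period. It takes a SparseArray (as returned by the API) and returns the same items, but sorted. I hope it helps.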
    /**
     * Takes all the text blocks in the frame and sorts them so that they follow
     * their real-life position (not the original output order).
     * Returns a SparseArray with the same text blocks, but sorted.
     */
    private SparseArray<TextBlock> sortTB(SparseArray<TextBlock> items) {
        if (items == null) {
            return null;
        }
        int size = items.size();
        if (size == 0) {
            return null;
        }
        // SparseArray to store the result: same content as the parameter, but sorted
        SparseArray<TextBlock> sortedSparseArray = new SparseArray<>(size);
        // Copy the SparseArray into a List so that streams and lambdas can be used
        List<TextBlock> listTest = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            listTest.add(items.valueAt(i));
        }
        // Sort via a stream and a lambda expression, then collect the result.
        // SAME_LINE_DELTA is a tolerance constant defined elsewhere in the class
        // (its value is not shown here).
        listTest = listTest.stream().sorted((textBlock1, textBlock2) -> {
            RectF rect1 = new RectF(textBlock1.getComponents().get(0).getBoundingBox());
            RectF rect2 = new RectF(textBlock2.getComponents().get(0).getBoundingBox());
            // Test whether the text blocks are on the same line
            if (rect2.centerY() < rect1.centerY() + SAME_LINE_DELTA
                    && rect2.centerY() > rect1.centerY() - SAME_LINE_DELTA) {
                // Same line: sort by X value
                return Float.compare(rect1.left, rect2.left);
            }
            // Otherwise sort by Y value
            return Float.compare(rect1.centerY(), rect2.centerY());
        }).collect(Collectors.toList());
        // Store the result in the empty SparseArray
        for (int i = 0; i < listTest.size(); i++) {
            sortedSparseArray.append(i, listTest.get(i));
        }
        // Return the sorted result
        return sortedSparseArray;
    }
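One caveat with a tolerance-based comparison like the one above: "on the same line" is not transitive (A can be near B and B near C while A is far from C), so the comparator can break the `Comparator` contract and yield an inconsistent order. A possible alternative, shown here only as a sketch and not as this answer's method, is to group the boxes into rows first and then sort each row by its left edge. `WordBox`, `RowGrouping` and the `sameLineDelta` parameter are hypothetical names; the plain fields stand in for what `getBoundingBox()` would provide on Android.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a text block's bounding box plus its text.
final class WordBox {
    final String text;
    final int left;
    final int top;
    final int bottom;

    WordBox(String text, int left, int top, int bottom) {
        this.text = text;
        this.left = left;
        this.top = top;
        this.bottom = bottom;
    }

    float centerY() {
        return (top + bottom) / 2f;
    }
}

final class RowGrouping {
    // Returns the boxes top to bottom, and left to right within each row.
    // A box joins the current row when its vertical center is within
    // sameLineDelta of the center of the first box in that row.
    static List<WordBox> readingOrder(List<WordBox> boxes, float sameLineDelta) {
        List<WordBox> byY = new ArrayList<>(boxes);
        byY.sort(Comparator.comparingDouble((WordBox b) -> b.centerY()));

        Comparator<WordBox> byLeft = Comparator.comparingInt((WordBox b) -> b.left);
        List<WordBox> result = new ArrayList<>();
        List<WordBox> row = new ArrayList<>();
        float rowY = 0f;
        for (WordBox box : byY) {
            if (!row.isEmpty() && box.centerY() - rowY > sameLineDelta) {
                // The current row is complete: emit it left to right.
                row.sort(byLeft);
                result.addAll(row);
                row.clear();
            }
            if (row.isEmpty()) {
                rowY = box.centerY(); // the first box anchors the new row
            }
            row.add(box);
        }
        row.sort(byLeft);
        result.addAll(row);
        return result;
    }
}
```

Because each sort step uses a plain total order (center Y, then left within a row), the result is deterministic. The trade-off is that rows are anchored to their first box, so heavily skewed pages may still need a different grouping rule.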