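I'm using Firebase ML Kit in an iOS app to recognize text in an image (the image is of a form), following this tutorial: https://www.raywenderlich.com/6565-ml-kit-tutorial-for-ios-recognizing-text-in-images
It works well, but I only want to recognize the text in specific regions of the form and store it in Firebase Realtime Database.
How can I define regions of the form to limit the text that is returned, and label that text so it can be pushed to the database?
I found a Python example that locates a keyword: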
    def assemble_word(word):
        # Join the recognized symbols (characters) of a word into one string.
        assembled_word = ""
        for symbol in word.symbols:
            assembled_word += symbol.text
        return assembled_word

    def find_word_location(document, word_to_find):
        # Walk pages > blocks > paragraphs > words and return the bounding
        # box of the first matching word (None if nothing matches).
        for page in document.pages:
            for block in page.blocks:
                for paragraph in block.paragraphs:
                    for word in paragraph.words:
                        assembled_word = assemble_word(word)
                        if assembled_word == word_to_find:
                            return word.bounding_box

    location = find_word_location(document, 'Overdrafts')
Output:

    vertices {
      x: 131
      y: 130
    }
    vertices {
      x: 200
      y: 130
    }
    vertices {
      x: 200
      y: 145
    }
    vertices {
      x: 131
      y: 145
    }
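Then, assuming the value sits in a box that starts at the right edge of the word "Overdrafts" and has the same width as the word, it uses that box to select the text to output: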
    def text_within(document, x1, y1, x2, y2):
        # Collect the symbols whose bounding boxes lie entirely inside the
        # rectangle (x1, y1)-(x2, y2), restoring the detected whitespace.
        text = ""
        for page in document.pages:
            for block in page.blocks:
                for paragraph in block.paragraphs:
                    for word in paragraph.words:
                        for symbol in word.symbols:
                            xs = [v.x for v in symbol.bounding_box.vertices]
                            ys = [v.y for v in symbol.bounding_box.vertices]
                            if min(xs) >= x1 and max(xs) <= x2 and min(ys) >= y1 and max(ys) <= y2:
                                text += symbol.text
                                # Break types: 1 = SPACE, 2 = SURE_SPACE,
                                # 3 = EOL_SURE_SPACE, 5 = LINE_BREAK
                                break_type = symbol.property.detected_break.type
                                if break_type in (1, 3):
                                    text += ' '
                                if break_type == 2:
                                    text += '\t'
                                if break_type == 5:
                                    text += '\n'
        return text

    # Box: from the word's top-right corner, extending right by the word's
    # width plus 30, down to the word's bottom edge.
    text_within(
        document,
        location.vertices[1].x,
        location.vertices[1].y,
        30 + location.vertices[1].x + (location.vertices[1].x - location.vertices[0].x),
        location.vertices[2].y,
    )
Output:

    '$ 511,789.61 '
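
For reference, the four arguments passed to text_within above can be worked out by hand from the "Overdrafts" vertices shown earlier. The following is only a small standalone sketch of that arithmetic: `Vertex` and `region_right_of` are hypothetical names introduced for illustration (they are not part of any OCR library), and the vertex order (top-left, top-right, bottom-right, bottom-left) is assumed from the output above.

```python
from collections import namedtuple

# Hypothetical stand-in for the OCR result's vertex objects, so the
# arithmetic can be run without any OCR client installed.
Vertex = namedtuple("Vertex", ["x", "y"])

def region_right_of(vertices, padding=30):
    """Return (x1, y1, x2, y2) for a box that starts at the word's right edge.

    Assumes the vertices are ordered top-left, top-right, bottom-right,
    bottom-left. The box is as wide as the word plus `padding`.
    """
    top_left, top_right, bottom_right, _bottom_left = vertices
    word_width = top_right.x - top_left.x
    x1 = top_right.x
    y1 = top_right.y
    x2 = padding + top_right.x + word_width
    y2 = bottom_right.y
    return x1, y1, x2, y2

# Vertices of the word "Overdrafts", copied from the output above.
overdrafts = [Vertex(131, 130), Vertex(200, 130), Vertex(200, 145), Vertex(131, 145)]
region = region_right_of(overdrafts)
print(region)  # (200, 130, 299, 145)
```

So in this example the search box spans x from 200 to 299 and y from 130 to 145, and only symbols that fall entirely inside it contribute to the returned text.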
Can I do something similar in Swift, or is there a better way?