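For example, I want to detect a handwritten coded string such as "A5b1x". So I either split it manually to get an image of each character, or try to have Vision recognize the whole string at once. Neither currently works, because I am not sure how to specify that the text is not in a language (or that it is a single character). This is what I entered on my Google Compute instance: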
gcloud ml vision detect-document "weblink to image"
No result for the image of "g": g
No result for the image of "e": e
Result for the image of "fxb3": fxb3
{
"responses": [
{
"fullTextAnnotation": {
"pages": [
{
"blocks": [
{
"blockType": "TEXT",
"boundingBox": {
"vertices": [
{
"x": 2433,
"y": 1289
},
{
"x": 1498,
"y": 1336
},
{
"x": 1468,
"y": 737
},
{
"x": 2403,
"y": 691
}
]
},
"confidence": 0.56,
"paragraphs": [
{
"boundingBox": {
"vertices": [
{
"x": 2433,
"y": 1289
},
{
"x": 1498,
"y": 1336
},
{
"x": 1468,
"y": 737
},
{
"x": 2403,
"y": 691
}
]
},
"confidence": 0.56,
"words": [
{
"boundingBox": {
"vertices": [
{
"x": 2433,
"y": 1289
},
{
"x": 1498,
"y": 1336
},
{
"x": 1468,
"y": 737
},
{
"x": 2403,
"y": 691
}
]
},
"confidence": 0.56,
"symbols": [
{
"boundingBox": {
"vertices": [
{
"x": 2433,
"y": 1289
},
{
"x": 2135,
"y": 1304
},
{
"x": 2105,
"y": 706
},
{
"x": 2403,
"y": 691
}
]
},
"confidence": 0.4,
"text": "\u0967"
},
{
"boundingBox": {
"vertices": [
{
"x": 2063,
"y": 1308
},
{
"x": 1788,
"y": 1322
},
{
"x": 1758,
"y": 723
},
{
"x": 2033,
"y": 710
}
]
},
"confidence": 0.62,
"text": "\u0967"
},
{
"boundingBox": {
"vertices": [
{
"x": 1750,
"y": 1323
},
{
"x": 1498,
"y": 1336
},
{
"x": 1468,
"y": 737
},
{
"x": 1720,
"y": 725
}
]
},
"confidence": 0.67,
"property": {
"detectedBreak": {
"type": "LINE_BREAK"
}
},
"text": "X"
}
]
}
]
}
]
}
],
"height": 2112,
"width": 4608
}
],
"text": "\u0967\u0967X\n"
},
"textAnnotations": [
{
"boundingPoly": {
"vertices": [
{
"x": 1467,
"y": 690
},
{
"x": 2432,
"y": 690
},
{
"x": 2432,
"y": 1335
},
{
"x": 1467,
"y": 1335
}
]
},
"description": "\u0967\u0967X\n",
"locale": "und"
},
{
"boundingPoly": {
"vertices": [
{
"x": 2433,
"y": 1289
},
{
"x": 1498,
"y": 1336
},
{
"x": 1468,
"y": 737
},
{
"x": 2403,
"y": 691
}
]
},
"description": "\u0967\u0967X"
}
]
}
]
}
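To make a response like the one above easier to read, the following minimal sketch walks the `fullTextAnnotation` hierarchy (pages, blocks, paragraphs, words, symbols) and prints each symbol with its Unicode name and confidence. The helper name `list_symbols` and the trimmed `sample` are illustrative assumptions; only the field names come from the response itself.

```python
import json
import unicodedata


def list_symbols(response):
    """Return (text, unicode_name, confidence) for every symbol in a Vision response dict.

    Hypothetical helper: it only relies on the field names visible in the
    response above (responses, fullTextAnnotation, pages, blocks, ...).
    """
    found = []
    for resp in response.get("responses", []):
        annotation = resp.get("fullTextAnnotation", {})
        for page in annotation.get("pages", []):
            for block in page.get("blocks", []):
                for paragraph in block.get("paragraphs", []):
                    for word in paragraph.get("words", []):
                        for symbol in word.get("symbols", []):
                            text = symbol.get("text", "")
                            # unicodedata.name only accepts a single character.
                            name = (unicodedata.name(text, "UNKNOWN")
                                    if len(text) == 1 else "UNKNOWN")
                            found.append((text, name, symbol.get("confidence")))
    return found


# Trimmed copy of the response above: bounding boxes removed, symbols kept.
sample = json.loads(r'''
{"responses": [{"fullTextAnnotation": {"pages": [{"blocks": [{"paragraphs": [
  {"words": [{"symbols": [
    {"confidence": 0.4,  "text": "\u0967"},
    {"confidence": 0.62, "text": "\u0967"},
    {"confidence": 0.67, "text": "X"}
  ]}]}
]}]}]}}]}
''')

if __name__ == "__main__":
    for text, name, confidence in list_symbols(sample):
        print(f"U+{ord(text):04X}  {name:<24} confidence={confidence}")
```

Run against the full response, this shows that the first two symbols are not Latin letters at all, and that their confidence values (0.4 and 0.62) are low.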
Answer 0 (score: 0)
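The Google Cloud Vision API currently cannot recognize single characters. A feature request for character recognition has been filed here. Please star it so that you receive updates on the request, and feel free to add comments describing the implementation you need.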
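As for your question about recognizing "coded" strings, the Vision API can do this. I successfully passed images containing fxb3 to the API and the results were good (here are image1 and image2). The response you got from the API is two consecutive Unicode characters (\u0967, the Devanagari digit one) followed by "X". The quality of the handwriting is the cause of the poor response. The OCR models are continuously being improved, but at present they cannot reliably detect handwriting that may be considered unclear.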
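Since the returned symbols are in a non-Latin script, one thing that may be worth trying is a language hint. The sketch below only builds the JSON body for the REST `images:annotate` method with `imageContext.languageHints` set; it does not send anything. The function name `build_annotate_request` and the example URL are hypothetical, and whether a hint actually improves the result for this handwriting is an untested assumption.

```python
import json


def build_annotate_request(image_uri, language_hints=("en",)):
    """Build a request body for the Vision API images:annotate method.

    Hypothetical helper: it asks for DOCUMENT_TEXT_DETECTION on a
    publicly reachable image and passes language hints in imageContext.
    """
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                # Hints are advisory; the service may still return other scripts.
                "imageContext": {"languageHints": list(language_hints)},
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder URL, standing in for the "weblink to image" in the question.
    body = build_annotate_request("https://example.com/fxb3.jpg")
    print(json.dumps(body, indent=2))
```

The printed body can be posted to the `images:annotate` endpoint with any HTTP client and suitable credentials; the response has the same shape as the one in the question.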