Question

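The text recognition feature of the Google Cloud Vision API can detect a wide variety of languages, and it can detect multiple languages within a single image. Currently, however, OCR is only available for a set of supported languages.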

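There is also a list of experimental languages that are still under development (also listed in the documentation). However, some of these experimental languages work well in other Google products such as Google Lens. For example, Sinhala (සිංහල, language code si) is in the experimental section, yet Google Lens gives very accurate OCR results for it.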

Question: I need to use some of these experimental languages for OCR through the Google Cloud Vision API. Is there any way to do that?

Reference: cloud.google.com/vision/docs/ocr

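Sample code: in the Python code below, the language is set through "language_hints" in image_context (for example, "language_hints": ["en"] for English). Here it is set to ["si"] for Sinhala.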

def detect_text(path):
    """Detects text in the file."""
    from google.cloud import vision
    import io
    client = vision.ImageAnnotatorClient()

    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    # google-cloud-vision 2.0+ exposes Image directly; older releases used vision.types.Image
    image = vision.Image(content=content)

    # Set the OCR language hint
    response = client.text_detection(
        image=image,
        image_context={"language_hints": ["si"]},  # "si" is Sinhala
    )

    texts = response.text_annotations
    print('Texts:')

    for text in texts:
        print('\n"{}"'.format(text.description))

        vertices = (['({},{})'.format(vertex.x, vertex.y)
                    for vertex in text.bounding_poly.vertices])

        print('bounds: {}'.format(','.join(vertices)))

detect_text('image.png')

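The code above extracts the text in an image. It works well for "supported" languages such as English. With an "experimental" language set, as with Sinhala (si) above, it does not work.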
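One thing the code above does not do is check response.error. As far as I can tell from the ImageContext reference, text detection returns an error when a hinted language is not among the supported ones, and an empty hint list leaves language detection automatic. Below is a minimal sketch for comparing the two cases. It assumes google-cloud-vision 2.0 or newer (where the class is vision.Image), and the helper names build_image_context, summarize_response and detect_text_with_hint are hypothetical, not part of the library. It is a diagnostic aid, not a confirmed way to enable experimental languages.

```python
def build_image_context(language_hints=None):
    """Build the image_context dict; an empty list means no hint is sent."""
    return {"language_hints": list(language_hints or [])}


def summarize_response(response):
    """Return (error_message, full_text, locale) from a text detection response."""
    # error.message is an empty string when the request succeeded.
    error_message = response.error.message
    annotations = response.text_annotations
    # When text is found, the first annotation carries the whole text block
    # and the locale the API assigned to it.
    full_text = annotations[0].description if annotations else ""
    locale = annotations[0].locale if annotations else ""
    return error_message, full_text, locale


def detect_text_with_hint(path, language_hints=None):
    """Run text detection on a local file (hypothetical helper)."""
    # Imported here so the helpers above can be used without the library.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    with open(path, "rb") as image_file:
        image = vision.Image(content=image_file.read())

    response = client.text_detection(
        image=image,
        image_context=build_image_context(language_hints),
    )
    return summarize_response(response)


# Example usage (needs credentials and a real image, so left commented out):
# print(detect_text_with_hint("image.png", ["si"]))  # with the Sinhala hint
# print(detect_text_with_hint("image.png"))          # automatic detection
```

Comparing the error message and locale from the two calls should at least show whether the "si" hint is being rejected outright or whether the text is simply not recognized.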

Google Cloud Vision API: using experimental languages for OCR

0 answers: