Question

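I have a Python function that calls Google's Cloud Vision API to run OCR on a PDF file. It works most of the time, but sometimes it returns an empty response. Other times it returns correct data for the first page and then empty data for the remaining pages.

The code I am using is as follows: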

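from google.cloud import vision

# Create the client once and reuse it across calls.
vision_client = vision.ImageAnnotatorClient()

def ocr(filename):
    # filename is the GCS URI of the PDF to process.
    mime_type = 'application/pdf'
    batch_size = 100  # not used by this synchronous call

    feature = vision.types.Feature(type=vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION)
    gcs_source = vision.types.GcsSource(uri=filename)
    input_config = vision.types.InputConfig(gcs_source=gcs_source, mime_type=mime_type)
    image_context = vision.types.ImageContext(language_hints=['pt'])

    request = vision.types.AnnotateFileRequest(
        features=[feature],
        input_config=input_config,
        image_context=image_context
    )

    # batch_annotate_files is synchronous and returns the batch response directly.
    response = vision_client.batch_annotate_files(requests=[request])
    # One file was requested, so the first entry holds its per-page results.
    return response.responses[0]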

The most recent PDF that gave me this problem is a 4-page PDF stored in GCS. It is a clean scanned document with text on all 4 pages. The result I get is:

responses {
  full_text_annotation {}
  context {page_number: 1}
}
responses {
  full_text_annotation {}
  context {page_number: 2}
}
responses {
  full_text_annotation {}
  context {page_number: 3}
}
responses {
  full_text_annotation {}
  context {page_number: 4}
}
total_pages: 4

I am using Python 3.6 and Google Cloud Vision v0.40.0.
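
Because the empty responses are intermittent, one workaround to try is repeating the call until the result looks complete. The sketch below is generic and untested against the Vision API; whether a retry actually returns text on the second attempt is an assumption. `call_with_retry`, `is_complete`, and `fake_ocr` are hypothetical names introduced here for illustration.

```python
import time


def call_with_retry(call, is_complete, max_attempts=3, delay_seconds=1.0):
    """Run call() until is_complete(result) is true or attempts run out.

    Returns the last result either way, so the caller can still inspect it.
    """
    result = None
    for attempt in range(1, max_attempts + 1):
        result = call()
        if is_complete(result):
            return result
        if attempt < max_attempts:
            # Linear backoff: wait a little longer after each failed attempt.
            time.sleep(delay_seconds * attempt)
    return result


# Fake OCR call that returns empty text twice, then succeeds (illustration only).
attempts = []


def fake_ocr():
    attempts.append(1)
    return "" if len(attempts) < 3 else "texto"


result = call_with_retry(fake_ocr, is_complete=bool, delay_seconds=0)
print(result, len(attempts))  # texto 3
```

In practice `call` would be a lambda wrapping `ocr(filename)`, and `is_complete` a check that every page has non-empty text.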

Empty response from Google Cloud Vision DOCUMENT_TEXT_DETECTION

0 answers: