OCR on a binary image

Time: 2019-03-21 14:50:55

Tags: ocr google-cloud-vision python-tesseract

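I have a binary image of text like this: [image: black on white text - cat]

I want to perform OCR on images like this. Each one contains no more than one word. I have tried both tesseract and Google Cloud Vision, but neither returns a result. I am using Python 3.6 on Windows 10.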


This image should be an easy task for either of the two tools, so I suspect I am missing something in my code. Please help!

Edit:

Thanks to F10 for pointing me in the right direction. This is how I got it working with a local image.

# export GOOGLE_APPLICATION_CREDENTIALS=kyourcredentials.json
import io

# Imports the Google Cloud client library
from google.cloud import vision
# google-cloud-vision < 2.0; newer releases expose vision.Image directly
from google.cloud.vision import types

# Instantiates a client
client = vision.ImageAnnotatorClient()

# Read the local image as raw bytes
with io.open("test.png", 'rb') as image_file:
    content = image_file.read()

image = types.Image(content=content)
response = client.text_detection(image=image)
texts = response.text_annotations
resp = ''

# Concatenate the description of every annotation returned
for text in texts:
    resp += ' ' + text.description

print(resp)
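
One thing to watch with the loop above: for text detection, the first entry of `text_annotations` holds the whole detected text and the following entries hold the individual words, so joining every description repeats the text. Below is a minimal sketch of reading the two parts separately. The helper names `detected_text` and `detected_words` are hypothetical, and the `SimpleNamespace` objects only stand in for real annotation objects so the snippet runs without credentials.

```python
from types import SimpleNamespace


def detected_text(annotations):
    """Return the full detected text, or '' when nothing was detected."""
    if len(annotations) == 0:
        return ""
    # The first annotation covers the entire detected text block.
    return annotations[0].description.strip()


def detected_words(annotations):
    """Return the per-word descriptions (every annotation after the first)."""
    return [a.description for a in annotations[1:]]


# Hypothetical stand-ins for response.text_annotations on the "cat" image.
sample = [
    SimpleNamespace(description="cat\n"),
    SimpleNamespace(description="cat"),
]

print(detected_text(sample))
print(detected_words(sample))
```

With a real response, the same helpers would be called as `detected_text(response.text_annotations)`.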

from PIL import Image as im
import pytesseract as ts

# canvas holds the Mat object from which the image was saved to PNG
print(ts.image_to_string(im.fromarray(canvas.reshape((480, 640)), 'L')))
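
A possible reason the pytesseract call returns nothing (a guess, since the contents of `canvas` are not shown) is that the array holds 0/1 or boolean values rather than 0/255, which renders as an almost entirely black image in mode 'L'. The sketch below assumes a white background, the 480x640 shape from the snippet above, and an arbitrary 20-pixel margin. `to_ocr_array` and `ocr_single_word` are hypothetical helper names. `--psm 8` is Tesseract's page segmentation mode for treating the image as a single word.

```python
import numpy as np


def to_ocr_array(canvas, height=480, width=640, border=20):
    """Return a uint8 image (0 = black, 255 = white) with a white margin."""
    arr = np.asarray(canvas).reshape((height, width))
    # Map any non-zero pixel to 255 so 0/1, boolean and 0/255 inputs all work.
    arr = (arr > 0).astype(np.uint8) * 255
    # Pad with white so the glyphs do not touch the image edge.
    return np.pad(arr, border, mode="constant", constant_values=255)


def ocr_single_word(canvas):
    """Run Tesseract in single-word mode (needs Pillow, pytesseract, Tesseract)."""
    from PIL import Image
    import pytesseract

    img = Image.fromarray(to_ocr_array(canvas))  # 2-D uint8 array -> mode 'L'
    return pytesseract.image_to_string(img, config="--psm 8").strip()
```

Usage would be `print(ocr_single_word(canvas))`. Whether this changes the result depends on what `canvas` actually contains.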

1 Answer:

Answer 0 (score: 1)

Based on this document, I used the following code and was able to get text: "cat\n" as output:

from pprint import pprint

# Imports the Google Cloud client library
from google.cloud import vision

# Instantiates a client
client = vision.ImageAnnotatorClient()

# Annotate an image stored in a Cloud Storage bucket
response = client.annotate_image({
    'image': {'source': {'image_uri': 'gs://<your_bucket>/ORW90.png'}},
    # vision.enums is the pre-2.0 location of this enum; 2.0+ exposes it as vision.Feature.Type
    'features': [{'type': vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION}],
})

pprint(response)
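
Rather than pretty-printing the whole response, the recognized string can be read from `full_text_annotation.text`, after checking `error.message` for a per-image failure. This is a minimal sketch. `text_or_raise` is a hypothetical helper, and the `SimpleNamespace` objects only mimic the shape of a real response so the snippet runs offline.

```python
from types import SimpleNamespace


def text_or_raise(response):
    """Return the document text, raising if the API reported an error."""
    # error.message is empty on success and non-empty on a per-image failure.
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.full_text_annotation.text


# Hypothetical stand-in shaped like a successful response for the "cat" image.
ok = SimpleNamespace(
    error=SimpleNamespace(message=""),
    full_text_annotation=SimpleNamespace(text="cat\n"),
)

print(repr(text_or_raise(ok)))
```

With the code above, this would be called as `text_or_raise(response)`.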

Hope it helps.