Extracting pages and content from an image with the Cloud Vision API

Time: 2019-12-19 18:19:10

Tags: python python-3.x ocr google-cloud-vision vision-api

I am trying to extract JSON from an image with the Python script included in the note further down.


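With that code, I only get the raw response in JSON format.

I want to get the pages, page height, page width, paragraphs, lines, words, and so on.

Is it possible to get the items listed above (pages, page_height, page_width, paragraphs, lines, words)?

Does anyone know how to get this?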

Note: I only have an API key, not a JSON key file. This is the file I run (the usage line is in the script's own help text):

    from base64 import b64encode
    from os import makedirs
    from os.path import join, basename
    from sys import argv
    import json
    import requests

    ENDPOINT_URL = 'https://vision.googleapis.com/v1/images:annotate'
    RESULTS_DIR = 'jsons'
    makedirs(RESULTS_DIR, exist_ok=True)

    def make_image_data_list(image_filenames):
        """
        image_filenames is a list of filename strings
        Returns a list of dicts formatted as the Vision API
            needs them to be
        """
        img_requests = []
        for imgname in image_filenames:
            with open(imgname, 'rb') as f:
                ctxt = b64encode(f.read()).decode()
                img_requests.append({
                    'image': {'content': ctxt},
                    'features': [{
                        'type': 'TEXT_DETECTION',
                        'maxResults': 1
                    }]
                })
        return img_requests

    def make_image_data(image_filenames):
        """Returns the image data lists as bytes"""
        imgdict = make_image_data_list(image_filenames)
        return json.dumps({"requests": imgdict}).encode()

    def request_ocr(api_key, image_filenames):
        response = requests.post(ENDPOINT_URL,
                                 data=make_image_data(image_filenames),
                                 params={'key': api_key},
                                 headers={'Content-Type': 'application/json'})
        return response

    if __name__ == '__main__':
        api_key, *image_filenames = argv[1:]
        if not api_key or not image_filenames:
            print("""
                Please supply an api key, then one or more image filenames

                $ python cloudvisreq.py api_key image1.jpg image2.png""")
        else:
            response = request_ocr(api_key, image_filenames)
            if response.status_code != 200 or response.json().get('error'):
                print(response.text)
            else:
                for idx, resp in enumerate(response.json()['responses']):
                    # save to JSON file
                    imgname = image_filenames[idx]
                    jpath = join(RESULTS_DIR, basename(imgname) + '.json')
                    with open(jpath, 'w') as f:
                        datatxt = json.dumps(resp, indent=2)
                        print("Wrote", len(datatxt), "bytes to", jpath)
                        f.write(datatxt)

                    # print the plaintext to screen for convenience
                    print("---------------------------------------------")
                    t = resp['textAnnotations'][0]
                    print("    Bounding Polygon:")
                    print(t['boundingPoly'])
                    print("    Text:")
                    print(t['description'])
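For reference, here is a minimal sketch of one way to pull those items out of each saved response. It assumes the response contains a `fullTextAnnotation` object laid out as pages, blocks, paragraphs, words, and symbols, with each page carrying `width` and `height`. The API has no separate "line" level, so lines are rebuilt here from each symbol's `detectedBreak` type. The function name `summarize_full_text` and the shape of the returned dicts are illustrative choices, not part of the API.

```python
# Break types treated as ending a line (assumption: HYPHEN marks a
# word wrapped across two lines, so it also closes the current line).
LINE_END_BREAKS = {"EOL_SURE_SPACE", "LINE_BREAK", "HYPHEN"}
SPACE_BREAKS = {"SPACE", "SURE_SPACE"}


def summarize_full_text(resp):
    """Collect page size, paragraphs, lines and words from one response dict.

    `resp` is a single element of response.json()['responses'].
    Returns a list with one dict per page (hypothetical output shape).
    """
    pages = []
    for page in resp.get("fullTextAnnotation", {}).get("pages", []):
        paragraphs = []
        for block in page.get("blocks", []):
            for para in block.get("paragraphs", []):
                words, lines, current = [], [], ""
                for word in para.get("words", []):
                    symbols = word.get("symbols", [])
                    # A word is the concatenation of its symbols' text.
                    words.append("".join(s.get("text", "") for s in symbols))
                    for s in symbols:
                        current += s.get("text", "")
                        brk = (s.get("property", {})
                                .get("detectedBreak", {})
                                .get("type"))
                        if brk in SPACE_BREAKS:
                            current += " "
                        elif brk in LINE_END_BREAKS:
                            lines.append(current)
                            current = ""
                if current:
                    # Keep trailing text that had no explicit line break.
                    lines.append(current)
                paragraphs.append({"words": words, "lines": lines})
        pages.append({
            "width": page.get("width"),
            "height": page.get("height"),
            "paragraphs": paragraphs,
        })
    return pages
```

In the script above, this could be called inside the `for idx, resp in enumerate(...)` loop as `summarize_full_text(resp)`. If `fullTextAnnotation` turns out to be missing from the responses, requesting the `DOCUMENT_TEXT_DETECTION` feature type instead of `TEXT_DETECTION` may be worth trying; that is a suggestion, not something verified against this script.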

0 answers:

No answers yet