I am using the OCR.space API to extract text from an image. I want to get the 'ParsedText' value on its own, as a string.
import requests
import json

# Placeholder value taken from the docstring below; replace with your own OCR.space API key.
API_KEY = 'helloworld'

def ocr_space_file(filename, overlay=False, api_key=API_KEY, language='eng'):
    """ OCR.space API request with local file.
        Python3.5 - not tested on 2.7
    :param filename: Your file path & name.
    :param overlay: Is OCR.space overlay required in your response.
                    Defaults to False.
    :param api_key: OCR.space API key.
                    Defaults to 'helloworld'.
    :param language: Language code to be used in OCR.
                    List of available language codes can be found on https://ocr.space/OCRAPI
                    Defaults to 'eng'.
    :return: Result in JSON format.
    """
    payload = {'isOverlayRequired': overlay,
               'apikey': api_key,
               'language': language,
               }
    # Upload the image as a multipart form field.
    with open(filename, 'rb') as f:
        r = requests.post('https://api.ocr.space/parse/image',
                          files={filename: f},
                          data=payload,
                          )
    # Decode the response body and parse it into a Python dictionary.
    m = r.content.decode()
    jsonstr = json.loads(m)
    print jsonstr["ParsedResults"]

ocr_space_file(filename='sample.png', language='eng')
Output:
[{u'ParsedText': u'Python is a great language.', u'FileParseExitCode': 1, u'ErrorMessage': u'', u'TextOverlay': {u'HasOverlay': False, u'Lines': [], u'Message': u'Text overlay is not provided as it is not requested'}, u'ErrorDetails': u''}]
I tried
print jsonstr["ParsedResults"]["ParsedText"]
but it raises an error:
Traceback (most recent call last):
  File "img.py", line 33, in <module>
    ocr_space_file(filename='sample.png', language='eng')
  File "img.py", line 29, in ocr_space_file
    print jsonstr["ParsedResults"]["ParsedText"]
TypeError: list indices must be integers, not str
Please help.
Thanks!
Answer 0 (score: 0)
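Your jsonstr["ParsedResults"] is a list containing a single dictionary:
[{u'ParsedText': u'Python is a great language.', ... }]
That is why indexing it with a string fails. Use jsonstr["ParsedResults"][0] to get the dictionary first, and then look up the key on it, for example:
jsonstr["ParsedResults"][0]["ParsedText"]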
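To illustrate the indexing above without calling the API, here is a minimal sketch that uses a hard-coded, trimmed copy of the response from the question. The helper name extract_parsed_text is hypothetical, and joining several results with newlines is only an assumption about how you might want to combine them.

```python
# Hard-coded, trimmed copy of the response shown in the question,
# so the example runs without a network request.
response = {
    "ParsedResults": [
        {
            "ParsedText": "Python is a great language.",
            "FileParseExitCode": 1,
            "ErrorMessage": "",
            "ErrorDetails": "",
        }
    ]
}


def extract_parsed_text(result):
    """Return the ParsedText of every entry in ParsedResults, joined by newlines.

    Hypothetical helper for illustration only.
    """
    # ParsedResults is a list, so iterate over it (or index it with [0])
    # before looking up the "ParsedText" key of each dictionary.
    return "\n".join(page["ParsedText"] for page in result["ParsedResults"])


# Index the list first, then the dictionary.
first_text = response["ParsedResults"][0]["ParsedText"]
all_text = extract_parsed_text(response)
print(first_text)
```

Returning the string rather than printing it inside the function makes it easier to reuse the text later.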
Answer 1 (score: 0)
Use something like:
print jsonstr["ParsedResults"][0]["ParsedText"]
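Note that indexing with [0] raises an IndexError if the list is empty, and a KeyError if the key is missing. A more defensive variant might look like the sketch below. The helper first_parsed_text and its default parameter are hypothetical, and I am only assuming that a failed request could come back without any parsed results; the question does not show such a response.

```python
def first_parsed_text(result, default=""):
    """Return the ParsedText of the first result, or `default` if there is none.

    Hypothetical helper; `result` is the dictionary produced by json.loads().
    """
    # Treat a missing key or a None value as an empty list.
    results = result.get("ParsedResults") or []
    if not results:
        return default
    return results[0].get("ParsedText", default)


text = first_parsed_text(
    {"ParsedResults": [{"ParsedText": "Python is a great language."}]}
)
print(text)
```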