This is my code:
import pytesseract
import cv2
from PIL import Image

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"

def main():
    original = cv2.imread('D_Testing.png', 0)
    # binary thresh it at value 100. It is now a black and white image
    ret, original = cv2.threshold(original, 100, 255, cv2.THRESH_BINARY)
    text = pytesseract.image_to_string(original, config='--psm 10')
    print(text)
    print(pytesseract.image_to_osd(Image.open('D_Testing.png')))

if __name__ == "__main__":
    main()
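For the first output, I get what I need, the letter D:
D
That is the expected result, but when it tries to execute the second print statement, it spits out this: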
Traceback (most recent call last):
  File "C:/Users/Me/Documents/Python/OpenCV/OpenCV_WokringTest/PytesseractAttempt.py", line 18, in <module>
    main()
  File "C:/Users/Me/Documents/Python/OpenCV/OpenCV_WokringTest/PytesseractAttempt.py", line 14, in main
    print(pytesseract.image_to_osd(Image.open('D_Testing.png')))
  File "C:\Users\Me\Documents\Python\OpenCV\OpenCV_WokringTest\venv\lib\site-packages\pytesseract\pytesseract.py", line 402, in image_to_osd
    }[output_type]()
  File "C:\Users\Me\Documents\Python\OpenCV\OpenCV_WokringTest\venv\lib\site-packages\pytesseract\pytesseract.py", line 401, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
  File "C:\Users\Me\Documents\Python\OpenCV\OpenCV_WokringTest\venv\lib\site-packages\pytesseract\pytesseract.py", line 218, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\Users\Me\Documents\Python\OpenCV\OpenCV_WokringTest\venv\lib\site-packages\pytesseract\pytesseract.py", line 194, in run_tesseract
    raise TesseractError(status_code, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, 'Tesseract Open Source OCR Engine v4.0.0.20181030 with Leptonica Warning: Invalid resolution 0 dpi. Using 70 instead. Too few characters. Skipping this page Warning. Invalid resolution 0 dpi. Using 70 instead. Too few characters. Skipping this page Error during processing.').
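I'm not sure what to do about this. I can't find much information about this error online. The goal is simply to get it to output the orientation of my letter. Thanks in advance for any helpful comments!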
Answer 0 (score: 1)
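A single character is too few for the OSD (orientation and script detection) function to reliably detect the script and orientation.
There is a parameter, min_characters_to_try, that controls the cutoff. It defaults to 50, so your image should contain at least 50 characters for OSD to work: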
> $ tesseract --print-parameters | fgrep characters ...
> min_characters_to_try 50 Specify minimum characters to try during OSD
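If the image cannot be made to contain more text, one thing to try is lowering that cutoff through Tesseract's `-c` config option. The following is a minimal sketch, not a guaranteed fix: the helper names `build_osd_config`, `parse_osd`, and `detect_orientation`, as well as the cutoff value of 5, are assumptions made for illustration. OSD on a single character may still fail or report a low-confidence orientation even with a lower cutoff.

```python
def build_osd_config(min_chars=5):
    """Build a Tesseract config string that lowers the OSD character cutoff.

    min_characters_to_try is the parameter shown in the
    `tesseract --print-parameters` output; 5 is an arbitrary example value.
    """
    if min_chars < 1:
        raise ValueError("min_chars must be at least 1")
    return f"-c min_characters_to_try={min_chars}"


def parse_osd(osd_text):
    """Turn the 'Key: value' lines of OSD text output into a dict of strings."""
    result = {}
    for line in osd_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no 'Key: value' shape
            result[key.strip()] = value.strip()
    return result


def detect_orientation(image_path, min_chars=5):
    """Run OSD with a lowered cutoff and return the parsed fields.

    Imports are local so the helpers above can be used without
    pytesseract or Pillow installed.
    """
    import pytesseract
    from PIL import Image

    osd = pytesseract.image_to_osd(
        Image.open(image_path),
        config=build_osd_config(min_chars),
    )
    return parse_osd(osd)
```

Calling `detect_orientation('D_Testing.png')` would then return a dict whose keys match the field names Tesseract prints (for example `Rotate` or `Orientation in degrees`), provided Tesseract accepts the image. If it still raises `TesseractError`, the practical options are to run OSD on a larger region of the page that contains more text, or to determine the rotation some other way.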