Errors when getting started with Python 3.7, PyCharm and Tesseract OCR on Windows 10

Time: 2018-07-10 16:55:55

Tags: windows pycharm ocr python-tesseract python-3.7

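I am working on a project that uses OCR with a Raspberry Pi and a camera. Until the Raspberry Pi arrives in the mail, I want to practice Python and learn the basics of OCR on a Windows machine. There are a few things I have not figured out yet, and I cannot get some of the examples I found to run.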

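I use PyCharm as my Python IDE. After following several tutorials, I still do not understand what the Tesseract OCR library is supposed to be: a module, a package, or an executable?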
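One way to look at this distinction: the traceback further down shows pytesseract calling subprocess.Popen, so pytesseract is a Python package that launches a separate tesseract program. The following is a minimal sketch, using only the standard library, that checks whether such a program can be reached through PATH. The helper name find_program is made up for illustration.

```python
import shutil


def find_program(name):
    """Return the full path of an executable found via PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_program("tesseract")
    if path is None:
        print("tesseract is not on PATH, so pytesseract cannot launch it")
    else:
        print("tesseract executable found at:", path)
```

If this prints that nothing was found, the Python package may be installed while the program it depends on is not.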

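As I understand it, I also need a library such as PIL (a.k.a. Pillow) to handle images. As with Tesseract, how do I reference these libraries in the IDE so that the script runs? I can add them through the settings, but where are they stored, and what do I do if I want to install a library that cannot be installed with pip?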
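To see where the interpreter that PyCharm runs keeps its installed libraries, something like the following could be run from inside the project. It is a sketch using only the standard library. package_location is a hypothetical helper name, and the paths printed depend on the machine and on which virtual environment is selected.

```python
import importlib.util
import sys
import sysconfig


def package_location(name):
    """Return the origin (usually a file path) of a top-level module, or None if it cannot be located."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # The interpreter running this script (in PyCharm: the project interpreter).
    print("interpreter:  ", sys.executable)
    # The directory where pip places pure-Python packages for this interpreter.
    print("site-packages:", sysconfig.get_paths()["purelib"])
    # "PIL" is the import name of Pillow; "pytesseract" is the wrapper package.
    for name in ("PIL", "pytesseract"):
        print(name, "->", package_location(name) or "not installed")
```

In the traceback in this question, pytesseract is loaded from the venv's lib\site-packages directory, which is the kind of location this script reports.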

As a first goal, I want to load a simple PNG image and get the text in it as a string.

I also tried reading this thread: Getting started with Python OCR on windows?

But even after the fixes, the Tesseract part of that example does not run.

I also came across this thread: use pytesseract to recognize text from image

I took the example from it and modified it slightly, so that it looks like this:

import pytesseract
from PIL import Image, ImageEnhance, ImageFilter

im = Image.open("example1.png")

'''
im = im.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)
im = im.convert('1')
im.save('temp2.jpg')
'''

text = pytesseract.image_to_string(Image.open('example1.png'))
print(text)

This is what the folder looks like: (folder image)

These are the errors I get when I run the example:

   C:\Users\Mike\OneDrive\Code\PycharmProjects\Hello_world\venv\Scripts\python.exe C:/Users/Mike/OneDrive/Code/Python_programming/Hello_OCR/Hello_OCR.py
Traceback (most recent call last):
  File "C:\Users\Mike\OneDrive\Code\PycharmProjects\Hello_world\venv\lib\site-packages\pytesseract\pytesseract.py", line 194, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\Users\Mike\OneDrive\Code\PycharmProjects\Hello_world\venv\lib\site-packages\pytesseract\pytesseract.py", line 165, in run_tesseract
    proc = subprocess.Popen(command, **subprocess_args())
  File "C:\Users\Mike\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 756, in __init__
    restore_signals, start_new_session)
  File "C:\Users\Mike\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 1155, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/Mike/OneDrive/Code/Python_programming/Hello_OCR/Hello_OCR.py", line 14, in <module>
    text = pytesseract.image_to_string(Image.open('example1.png'))
  File "C:\Users\Mike\OneDrive\Code\PycharmProjects\Hello_world\venv\lib\site-packages\pytesseract\pytesseract.py", line 286, in image_to_string
    return run_and_get_output(image, 'txt', lang, config, nice)
  File "C:\Users\Mike\OneDrive\Code\PycharmProjects\Hello_world\venv\lib\site-packages\pytesseract\pytesseract.py", line 201, in run_and_get_output
    raise TesseractNotFoundError()
pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path

Process finished with exit code 1
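
The last line of the traceback says the tesseract executable is not installed or not on PATH. One common way to handle that, sketched here under assumptions: look for the executable on PATH, fall back to a candidate install location, and point pytesseract at it through pytesseract.pytesseract.tesseract_cmd. The path C:\Program Files\Tesseract-OCR\tesseract.exe is only an assumed install location, and the helper names are made up for illustration. The Tesseract program itself has to be installed separately from the pytesseract package.

```python
import os
import shutil

# Assumed install location; adjust to wherever Tesseract was actually installed.
CANDIDATES = (r"C:\Program Files\Tesseract-OCR\tesseract.exe",)


def first_existing(paths):
    """Return the first path that is an existing file, or None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None


def resolve_tesseract(candidates=CANDIDATES):
    """Prefer an executable on PATH, then fall back to the candidate paths."""
    return shutil.which("tesseract") or first_existing(candidates)


def image_to_text(image_path):
    """Run OCR on one image; needs the pytesseract and Pillow packages."""
    import pytesseract
    from PIL import Image

    exe = resolve_tesseract()
    if exe is None:
        raise FileNotFoundError("tesseract executable not found")
    # Tell the wrapper exactly which executable to launch.
    pytesseract.pytesseract.tesseract_cmd = exe
    with Image.open(image_path) as im:
        return pytesseract.image_to_string(im)
```

With Tesseract installed, print(image_to_text("example1.png")) would then be the equivalent of the script above.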

What could be causing this problem, and how can I solve it?

Thanks for your help!

0 answers:

No answers