I get an error when I open an image and digitize it
I am running this code in a Jupyter notebook on Windows 10. I also installed pytesseract and tesseract using pip.
try:
    from PIL import Image
except ImportError:
    import Image
import pytesseract
# If you don't have tesseract executable in your PATH, include the following:
# pytesseract.pytesseract.tesseract_cmd = r'<full_path_to_your_tesseract_executable>'
# Example tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract'
# Simple image to string
print(pytesseract.image_to_string(Image.open('Train/TR_1.jpg')))
Answer 0 (score: 0)
You have to install Tesseract itself first. On CentOS, you can do this by running
yum-config-manager --add-repo https://download.opensuse.org/repositories/home:/Alexander_Pozdnyakov/CentOS_7/
rpm --import https://build.opensuse.org/projects/home:Alexander_Pozdnyakov/public_key
yum install -y tesseract tesseract-langpack-deu
There should be an equivalent for Windows, which you can find at https://github.com/tesseract-ocr/tesseract/wiki
pytesseract is just a wrapper around the Tesseract package.
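Since pytesseract only wraps the Tesseract executable, it can help to check whether that executable is reachable before calling image_to_string. The following is a minimal sketch using only the standard library; the helper name tesseract_version is made up for illustration, and it assumes the binary is called "tesseract" and accepts --version.

```python
import shutil
import subprocess


def tesseract_version(executable="tesseract"):
    """Return the first line of `<executable> --version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some builds write the version banner to stderr, so fall back to it.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


version = tesseract_version()
if version is None:
    print("Tesseract engine not found on PATH; install it or set tesseract_cmd.")
else:
    print("Found:", version)
```

If this reports that nothing was found, installing pytesseract with pip was not enough and the engine still has to be installed separately, or its location set explicitly.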
Answer 1 (score: 0)
Try adding this line, with your own path (for example)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
before
print(pytesseract.image_to_string(Image.open('Train/TR_1.jpg')))
after you have downloaded and installed the Windows executable from https://github.com/tesseract-ocr/tesseract/wiki
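If you are not sure where the installer put the executable, a rough sketch like the one below can pick the first path that actually exists and assign it to tesseract_cmd. The candidate paths and the helper names first_existing and configure_pytesseract are assumptions for illustration, not part of pytesseract; adjust the list to your own install location.

```python
from pathlib import Path

# Assumed install locations; change these to match your machine.
CANDIDATES = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
]


def first_existing(paths):
    """Return the first path that points at an existing file, or None."""
    for p in paths:
        if Path(p).is_file():
            return p
    return None


def configure_pytesseract(paths=CANDIDATES):
    """Point pytesseract at the first existing executable; return that path, or None."""
    cmd = first_existing(paths)
    if cmd is None:
        return None
    # Imported lazily so the lookup above works even without pytesseract installed.
    import pytesseract
    pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd
```

Call configure_pytesseract() once at the top of the notebook, before image_to_string. A return value of None means none of the listed paths exist, so the executable is either not installed or lives somewhere else.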