I am using pytesseract on Windows 10 x64 with Python 3.5.2 x64 and Tesseract 4.0. The code is as follows:
# -*- coding: utf-8 -*-
try:
    import Image
except ImportError:
    from PIL import Image
import pytesseract
print(pytesseract.image_to_string(Image.open('d:/testimages/name.gif'), lang='chi_sim'))
Error:
Traceback (most recent call last):
  File "D:/test.py", line 10, in <module>
    print(pytesseract.image_to_string(Image.open('d:/testimages/name.gif'), lang='chi_sim'))
  File "C:\Users\dell\AppData\Local\Programs\Python\Python35\lib\site-packages\pytesseract\pytesseract.py", line 165, in image_to_string
    raise TesseractError(status, errors)
pytesseract.pytesseract.TesseractError: (1, 'Error opening data file \\Program Files (x86)\\Tesseract-OCR\\tessdata/chi_sim.traineddata')
My tessdata directory is C:\Program Files (x86)\Tesseract-OCR\tessdata.
Why does this happen?
Answer 0 (score: 0)
Set the TESSDATA_PREFIX environment variable to C:\Program Files (x86)\Tesseract-OCR\
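If changing the system-wide environment variable is inconvenient, the same value can be set from inside the script before pytesseract runs. This is only a minimal sketch, assuming pytesseract launches the tesseract executable as a child process that inherits the script's environment. set_tessdata_prefix is a hypothetical helper, not part of pytesseract.

```python
import os


def set_tessdata_prefix(install_dir):
    """Set TESSDATA_PREFIX for the current process and return the value used.

    install_dir is assumed to be the Tesseract install directory, i.e. the
    parent of the tessdata folder, as suggested in the answer above.
    """
    # os.path.join(path, "") appends a trailing separator if one is missing,
    # mirroring the trailing backslash in the suggested value.
    prefix = os.path.join(install_dir, "")
    os.environ["TESSDATA_PREFIX"] = prefix
    return prefix


# Hypothetical usage, call this before the first pytesseract call:
# set_tessdata_prefix(r"C:\Program Files (x86)\Tesseract-OCR")
# print(pytesseract.image_to_string(Image.open('d:/testimages/name.gif'), lang='chi_sim'))
```

A value set this way only applies to the running script and whatever processes it starts, so other programs are unaffected.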
Answer 1 (score: 0)
If you get a tessdata error such as "Error opening data file...", pass the tessdata directory explicitly through the config argument:
from PIL import Image
import pytesseract

tessdata_dir_config = '--tessdata-dir "<replace_with_your_tessdata_dir_path>"'
# Example config: '--tessdata-dir "C:\\Program Files (x86)\\Tesseract-OCR\\tessdata"'
# It's important to add double quotes around the dir path.
image = Image.open('d:/testimages/name.gif')  # image path taken from the question
pytesseract.image_to_string(image, lang='chi_sim', config=tessdata_dir_config)
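Before passing the option, it can help to confirm that the language file really is where you think it is. This is only a minimal sketch. build_tessdata_config is a hypothetical helper (not part of pytesseract), and it assumes the language data is stored as <lang>.traineddata directly inside the tessdata directory, as the error message in the question suggests.

```python
import os


def build_tessdata_config(tessdata_dir, lang):
    """Return a --tessdata-dir config string after checking the language file.

    Raises FileNotFoundError if <lang>.traineddata is not in tessdata_dir.
    """
    traineddata = os.path.join(tessdata_dir, lang + ".traineddata")
    if not os.path.isfile(traineddata):
        raise FileNotFoundError("Missing language file: " + traineddata)
    # Double quotes around the path keep directories with spaces intact.
    return '--tessdata-dir "{}"'.format(tessdata_dir)


# Hypothetical usage with the directory from the question:
# config = build_tessdata_config(r"C:\Program Files (x86)\Tesseract-OCR\tessdata", "chi_sim")
# pytesseract.image_to_string(image, lang='chi_sim', config=config)
```

If the helper raises, the problem is a missing or misplaced chi_sim.traineddata file rather than the way the path is passed to Tesseract.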