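I am using the pytesseract library to extract text from an image. The code works fine when I run it on localhost, but when I deploy it on OpenShift it gives me the error above.
Below is the code I have written so far.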
try:
    import Image
except ImportError:
    from PIL import Image
import pytesseract
filePath = PATH_WHERE_FILE_IS_LOCATED # '/var/lib/openshift/555.../app-root/data/data/y.jpg'
text = pytesseract.image_to_string(Image.open(filePath)) # this line produces error
The traceback for the error is:
>>> pytesseract.image_to_string(Image.open(filePath))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/var/lib/openshift/56faaee42d527151d5000089/app-root/runtime/repo/pytesseract/pytesseract.py", line 132, in image_to_string
boxes=boxes)
File "/var/lib/openshift/56faaee42d527151d5000089/app-root/runtime/repo/pytesseract/pytesseract.py", line 73, in run_tesseract
stderr=subprocess.PIPE)
File "/opt/rh/python27/root/usr/lib64/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/opt/rh/python27/root/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
However, Image.open(filePath) on its own succeeds and returns an object reference:
<PIL.PngImagePlugin.PngImageFile image mode=RGBA size=1366x768 at 0x7FC5A9F719D0>
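How can I get rid of this error? Thanks in advance!
Answer 0: (score: 4)
Either you have not installed tesseract-ocr on OpenShift, or it is not in your PATH. See https://pypi.python.org/pypi/pytesseract/0.1 and check whether you can run the tesseract command from the command line.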
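The same check can be made from Python before calling pytesseract. This is a minimal sketch, assuming Python 3.3 or later (shutil.which is not available on the Python 2.7 shown in the traceback); find_tesseract is a hypothetical helper name, not part of pytesseract.

```python
import shutil


def find_tesseract(command="tesseract"):
    """Return the full path of `command` if it is on PATH, otherwise None."""
    return shutil.which(command)


if __name__ == "__main__":
    location = find_tesseract()
    if location is None:
        print("tesseract is not on PATH; install tesseract-ocr or extend PATH")
    else:
        print("found tesseract at %s" % location)
```

If this reports that the binary is missing on the server but finds it on localhost, that would explain why the code only fails after deployment.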
Answer 1: (score: 4)
As mentioned above, install tesseract-ocr.
You can run commands on the server with rhc ssh. More Windows-specific details can be found here.
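If tesseract ends up installed somewhere that is not on PATH, pytesseract can be pointed at the binary through its tesseract_cmd module attribute. The following is only a sketch: resolve_tesseract_cmd and configure_pytesseract are hypothetical helpers, and the candidate paths you pass in are assumptions that depend on where the binary was actually installed.

```python
import os


def resolve_tesseract_cmd(candidates):
    """Return the first candidate that is an executable file, or None."""
    for candidate in candidates:
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def configure_pytesseract(candidates):
    """Point pytesseract at the first usable binary and return its path."""
    import pytesseract  # imported lazily so the resolver works without it

    cmd = resolve_tesseract_cmd(candidates)
    if cmd is None:
        raise RuntimeError("no tesseract binary found in: %r" % (candidates,))
    # pytesseract invokes whatever command this attribute names.
    pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd
```

Call configure_pytesseract once at startup, before the first image_to_string call, with the locations you want to try in order.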
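Answer 2: (score: 4)
IMHO, if I understand OpenShift correctly, it may be like Heroku: the filesystem is ephemeral, and paths may be slightly or completely different from those on your machine, so check that first.
I hope this helps.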
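One way to avoid hard-coding a server path is to build it from the environment. This sketch assumes the OpenShift v2 cartridge environment, where OPENSHIFT_DATA_DIR points at the persistent app-root/data directory; data_file_path is a hypothetical helper, and the fallback to the current directory is an assumption for local runs.

```python
import os


def data_file_path(filename, env=None):
    """Join filename onto the data directory named by OPENSHIFT_DATA_DIR.

    Falls back to the current working directory when the variable is not
    set, for example when running on localhost.
    """
    if env is None:
        env = os.environ
    base = env.get("OPENSHIFT_DATA_DIR", os.getcwd())
    return os.path.join(base, filename)
```

Printing the resulting path on the server is a quick way to confirm that the image is where the code expects it.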
Answer 3: (score: 4)
Try this code and check where the error occurs:
try:
    import Image
    print("image not from PIL")
except ImportError:
    print("image from PIL")
    from PIL import Image
import pytesseract
import os
filePath = PATH_WHERE_FILE_IS_LOCATED # '/var/lib/openshift/555.../app-root/data/data/y.jpg'
# Report a missing file before trying to open it.
if not os.path.exists(filePath):
    print("no image file")
I = None
try:
    I = Image.open(filePath)
except Exception as e:
    raise RuntimeError("Can't open image because %s" % e)
text = pytesseract.image_to_string(I) # this line produces error
PS: You can print the module version like this:
print(Image.__version__)
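Answer 4: (score: 3)
I think you may not have entered the correct image path. You should check your path.
Have you also verified the tesseract-ocr installation? You should see no errors both when you import the module and when you check it from the command-line utility.
As Wuelfhis Asuaje said, you should also make sure you have sufficient permissions to access the file at that path.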
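The path and permission checks can be combined into one small function. This is a sketch only; check_image_path is a hypothetical helper, and os.access reports what the current user may do, which can differ from the user your deployed application runs as.

```python
import os


def check_image_path(file_path):
    """Return a list of problems found with file_path (empty if none)."""
    problems = []
    if not os.path.exists(file_path):
        problems.append("path does not exist: %s" % file_path)
    elif not os.path.isfile(file_path):
        problems.append("path is not a regular file: %s" % file_path)
    elif not os.access(file_path, os.R_OK):
        problems.append("file is not readable by this user: %s" % file_path)
    return problems
```

An empty list means the image itself is reachable, which points back at the tesseract binary as the missing file.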
Answer 5: (score: 2)
You should install Google's tesseract-ocr from http://code.google.com/p/tesseract-ocr/.
Make sure the tesseract command is available on the server.
Under the hood, pytesseract uses subprocess (https://github.com/madmaze/pytesseract/blob/master/src/pytesseract.py#L93) to call the tesseract command:
proc = subprocess.Popen(command,
                        stderr=subprocess.PIPE)
Now guess what happens if the command is not available:
In [45]: subprocess.Popen(['tesseract'])
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-45-f4e9dd5a7f0b> in <module>()
----> 1 subprocess.Popen(['tesseract'])
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags)
708 p2cread, p2cwrite,
709 c2pread, c2pwrite,
--> 710 errread, errwrite)
711 except Exception:
712 # Preserve original exception in case os.close raises.
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.pyc in _execute_child(self, args, executable, preexec_fn, close_fds, cwd, env, universal_newlines, startupinfo, creationflags, shell, to_close, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite)
1333 raise
1334 child_exception = pickle.loads(data)
-> 1335 raise child_exception
1336
1337
OSError: [Errno 2] No such file or directory
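So the "No such file or directory" in the traceback refers to the tesseract executable, not to the image. A minimal sketch of detecting this up front, assuming nothing beyond the standard library; tesseract_available is a hypothetical helper name.

```python
import errno
import subprocess


def tesseract_available(command="tesseract"):
    """Return True if `command` can be started, False if the binary is missing."""
    try:
        proc = subprocess.Popen([command],
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    except OSError as exc:
        # ENOENT is the "[Errno 2] No such file or directory" case above.
        if exc.errno == errno.ENOENT:
            return False
        raise
    # Drain the pipes and wait so the child process does not linger.
    proc.communicate()
    return True
```

The function only reports whether the process could be launched; it does not inspect the exit status or the output of the command.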