Question

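I am trying to get started with pytesseract, but as you can see below, I am running into a problem.

I have found people getting what seems to be the same error, and they say it is a bug in PIL 1.1.7. Others say the problem is caused by PIL being lazy, and that you need to force PIL to load the image with im.load() after opening it, but that did not seem to help. Any suggestions gratefully received.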

K:\Glamdring\Projects\Images\OCR>python
Python 2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from PIL import Image
>>> import pytesseract
>>> pytesseract.image_to_string(Image.open('foo.png'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build\bdist.win32\egg\pytesseract\pytesseract.py", line 143, in image_to_string
  File "c:\Python27_32\lib\site-packages\PIL\Image.py", line 1497, in split
    if self.im.bands == 1:
AttributeError: 'NoneType' object has no attribute 'bands'

Answer 1

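Try using the objects from the Image and pytesseract modules separately, that is, open and load the image first and only then pass it to pytesseract.
It solved my problem: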

try:
    import Image
except ImportError:
    from PIL import Image
import pytesseract

img = Image.open('myImage.jpg')
img.load()
i = pytesseract.image_to_string(img)
print(i)  # parenthesized form works on both Python 2 and Python 3

Answer 2

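I have no prior experience with PIL, but I was bored so I tried to look into it, and from what I can tell it is probably a bug.

If we look at the execution steps, this is not pytesseract's fault.

Initially, your Image.open('foo.png') works fine and raises nothing related to your stack trace.
pytesseract.image_to_string(img) comes in afterwards and does the following: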
```
# Omitting the rest of the method.

# calls method split() of your image object.
if len(image.split()) == 4:
```
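This is the first statement that operates on image, so we know we have to look back into PIL to find the root of the problem.

Your stack trace carries the specific message AttributeError: 'NoneType' object has no attribute 'bands', raised by the statement if self.im.bands. This means that im is None.

Let's take a look at the image.split() method: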

"""
Split this image into individual bands. This method returns a
tuple of individual image bands from an image. For example,
splitting an "RGB" image creates three new images each
containing a copy of one of the original bands (red, green,
blue).

:returns: A tuple containing bands.
"""

self.load() # This is the culprit since..
if self.im.bands == 1: # .. here the im attribute of the image = None
    ims = [self.copy()]

# Omitting the rest ---
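To make the failure mode concrete, here is a toy model of the pattern in the excerpt above. It is not PIL's code: ToyImage, Core, good_decoder and broken_decoder are hypothetical names invented for this illustration. It only shows that if load() returns without setting im, the very next attribute access in split() fails with the error from the question.

```python
from collections import namedtuple

# Hypothetical stand-in for PIL's internal core object; only `bands` matters here.
Core = namedtuple("Core", "bands")


def good_decoder():
    """Pretend decoding succeeded and produced a 3-band image."""
    return Core(bands=3)


def broken_decoder():
    """Pretend decoding silently produced nothing."""
    return None


class ToyImage:
    """Toy model of a lazily loaded image. NOT PIL's implementation."""

    def __init__(self, decoder):
        self._decoder = decoder  # only called when pixel data is needed
        self.im = None           # stays None until load() sets it

    def load(self):
        if self.im is None:
            self.im = self._decoder()

    def split(self):
        self.load()
        if self.im.bands == 1:   # raises AttributeError if im is still None
            return [self]
        return ["band %d" % i for i in range(self.im.bands)]


print(len(ToyImage(good_decoder).split()))

try:
    ToyImage(broken_decoder).split()
except AttributeError as exc:
    print(exc)
```

The point of the sketch is that split() trusts load() unconditionally, so any path through load() that leaves im unset surfaces as an AttributeError on bands rather than as a decoding error.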

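Clearly, self.load() is what sets the im value. I verified this with a test image and it seems to work without problems [I suggest you try the same with your image]: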

In [7]: print img.im
None

In [8]: img.load()
Out[8]: <PixelAccess at 0x7fe03ab6a210>

In [9]: print img.im
<ImagingCore object at 0x7fe03ab6a1d0>
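If you want to repeat that check on a current Pillow install, a rough sketch using only public calls (Image.open, load, split) might look like the following. It avoids reading the private im attribute, whose behaviour before load() may differ between versions. The helper names open_and_load and make_test_png are made up for this example, and the in-memory PNG is just a stand-in for your own foo.png.

```python
import io

from PIL import Image


def open_and_load(source):
    """Open an image and force the pixel data to be decoded right away."""
    img = Image.open(source)
    img.load()  # Image.open() is lazy; load() reads the actual pixel data
    return img


def make_test_png():
    """Build a tiny RGBA PNG in memory so the example needs no file on disk."""
    buf = io.BytesIO()
    Image.new("RGBA", (4, 4), (255, 0, 0, 255)).save(buf, format="PNG")
    buf.seek(0)
    return buf


img = open_and_load(make_test_png())
bands = img.split()  # the call that failed in the question's traceback
print(img.mode, len(bands))
```

If split() succeeds on the synthetic image but fails on your own file, that points at the file (or at how that particular format is decoded) rather than at pytesseract.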

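Now let's look at load(). I don't really have the background to understand the internals here, but I did spot quite a few FIXME comments right before im is assigned, in particular: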

# -- Omitting rest --         

# FIXME: on Unix, use PROT_READ etc
self.map = mmap.mmap(file.fileno(), size)
self.im = Image.core.map_buffer(
                    self.map, self.size, d, e, o, a
                    )

# -- Omitting rest --

if hasattr(self, "tile_post_rotate"):
    # FIXME: This is a hack to handle rotated PCD's
    self.im = self.im.rotate(self.tile_post_rotate)
    self.size = self.im.size

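This may be a sign that there are some issues here that need attention, though I can't be 100% sure.

Of course, for some reason this might be caused by your image. The load() method worked fine with the image I supplied (pytesseract just gave me a different error :P). You are probably best off opening a new issue for this. If any PIL experts happen to see this, please enlighten us.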

Answer 3

im.load() worked for me when I ran the program in administrator mode. Also, if the tesseract executable is not in your PATH, add this line:

pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe'
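Rather than hard-coding the path, one option is to look on PATH first and fall back to known install locations. This is only a sketch: find_tesseract and CANDIDATES are hypothetical names for this example, and the single fallback entry is simply the path quoted above, which may not match your machine.

```python
import os
import shutil


def find_tesseract(name="tesseract", candidates=()):
    """Return a path to the tesseract executable, or None if it is not found."""
    found = shutil.which(name)  # search the directories listed in PATH
    if found:
        return found
    for path in candidates:     # then try explicit fallback locations
        if os.path.isfile(path):
            return path
    return None


# Assumed fallback list; extend it with wherever tesseract lives on your system.
CANDIDATES = [
    "C:/Program Files (x86)/Tesseract-OCR/tesseract.exe",
]

path = find_tesseract(candidates=CANDIDATES)
if path is None:
    print("tesseract not found; install it or extend CANDIDATES")
else:
    print("tesseract found at", path)
    # With pytesseract installed, the result can then be assigned as above:
    # import pytesseract
    # pytesseract.pytesseract.tesseract_cmd = path
```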

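If you have already read the image into an array (with imread() rather than Image.open() followed by im.load()), or grabbed a frame from a video, and possibly done some image processing on that variable (image), then you need to convert it to a PIL image before passing it on: pytesseract.image_to_string(Image.fromarray(image))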
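One detail worth watching with arrays: this sketch assumes the imread() in question is OpenCV's cv2.imread(), which returns channels in BGR order, while Image.fromarray() treats an H x W x 3 uint8 array as RGB. The helper name bgr_array_to_pil is made up for this example, and a synthetic NumPy array stands in for a real frame so that OpenCV is not required to run it.

```python
import numpy as np
from PIL import Image


def bgr_array_to_pil(frame):
    """Convert an H x W x 3 uint8 array in BGR order to an RGB PIL image."""
    if frame.ndim != 3 or frame.shape[2] != 3:
        raise ValueError("expected an H x W x 3 array")
    # Reverse the channel axis (BGR -> RGB) and make the result contiguous.
    rgb = np.ascontiguousarray(frame[:, :, ::-1])
    return Image.fromarray(rgb)


# Synthetic stand-in for a frame from cv2.imread(): pure blue in BGR order.
frame = np.zeros((2, 3, 3), dtype=np.uint8)
frame[:, :, 0] = 255

img = bgr_array_to_pil(frame)
print(img.mode, img.size, img.getpixel((0, 0)))

# With pytesseract installed, the converted image can then be passed along:
# import pytesseract
# text = pytesseract.image_to_string(img)
```

For OCR the channel order often makes little difference, so treat the swap as a precaution rather than a requirement.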

Answer 4

As @J_Mascis said, using the objects separately worked here too:

    import pytesseract
    from PIL import Image
    img = Image.open('im.jpg')
    img.load()

    print(pytesseract.image_to_string(img, lang='eng'))  # 'eng' for English

Why does pytesseract cause AttributeError: 'NoneType' object has no attribute 'bands'?
