"ValueError: cannot filter palette images" during Pytesseract conversion

Time: 2017-04-07 00:26:40

Tags: python ocr typeerror tesseract pytesser

I'm getting this error with the following Pytesseract code. (Python 3.6.1, Mac OSX)

import pytesseract
import requests
from PIL import Image
from PIL import ImageFilter
from io import StringIO, BytesIO

def process_image(url):
    image = _get_image(url)
    image.filter(ImageFilter.SHARPEN)
    return pytesseract.image_to_string(image)


def _get_image(url):
    r = requests.get(url)
    s = BytesIO(r.content)
    img = Image.open(s)
    return img

process_image("https://www.prepressure.com/images/fonts_sample_ocra_medium.png")

Error:

/usr/local/Cellar/python3/3.6.0_1/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/g/pyfo/reddit/ocr.py
Traceback (most recent call last):
  File "/Users/g/pyfo/reddit/ocr.py", line 20, in <module>
    process_image("https://www.prepressure.com/images/fonts_sample_ocra_medium.png")
  File "/Users/g/pyfo/reddit/ocr.py", line 10, in process_image
    image.filter(ImageFilter.SHARPEN)
  File "/usr/local/lib/python3.6/site-packages/PIL/Image.py", line 1094, in filter
    return self._new(filter.filter(self.im))
  File "/usr/local/lib/python3.6/site-packages/PIL/ImageFilter.py", line 53, in filter
    raise ValueError("cannot filter palette images")
ValueError: cannot filter palette images

Process finished with exit code 1

It seems simple, but it doesn't work. Any help would be greatly appreciated.

1 answer:

Answer 0: (score: 3)

The image you have is a palette-based image (mode "P"). You need to convert it to a full RGB image before you can use the PIL filters.

import pytesseract
import requests
from PIL import Image, ImageFilter
from io import StringIO, BytesIO

def process_image(url):
    image = _get_image(url)
    image = image.convert('RGB')
    image = image.filter(ImageFilter.SHARPEN)
    return pytesseract.image_to_string(image)


def _get_image(url):
    r = requests.get(url)
    s = BytesIO(r.content)
    img = Image.open(s)
    return img

process_image("https://www.prepressure.com/images/fonts_sample_ocra_medium.png")

You should also note that the .convert() and .filter() methods return a copy of the image; they don't change the existing image object. You need to assign the return value to a variable, as shown in the code above.

Note: I don't have pytesseract, so I can't check the last line of process_image().
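If you want to check the conversion step on its own, without the network or pytesseract, something like the following sketch should do. It builds a small palette image in memory instead of downloading one; the helper name sharpen_any_mode and the greyscale palette are made up for illustration and are not part of the original code.

```python
from PIL import Image, ImageFilter


def sharpen_any_mode(image):
    """Return a sharpened copy, converting palette ("P") images to RGB first.

    Hypothetical helper for illustration only.
    """
    if image.mode == "P":
        # convert() returns a new image; the original object is left untouched.
        image = image.convert("RGB")
    # filter() also returns a new image, so the result must be returned/assigned.
    return image.filter(ImageFilter.SHARPEN)


# Build a tiny palette image with an explicit 256-entry greyscale palette.
palette_img = Image.new("P", (8, 8))
palette_img.putpalette([v for i in range(256) for v in (i, i, i)])

result = sharpen_any_mode(palette_img)
print(palette_img.mode, result.mode)
```

The original image keeps its palette mode, while the returned copy is RGB and sharpened, which is why the return values have to be assigned.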