When I run the code below, it reports:
Traceback (most recent call last):
  File "ocr2.py", line 20, in <module>
    image_jpeg = image_pdf.convert('jpeg')
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/wand/image.py", line 3032, in convert
    cloned.format = format
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/wand/image.py", line 2932, in format
    raise ValueError(repr(fmt) + ' is unsupported format')
ValueError: 'jpeg' is unsupported format
Can anyone help me solve this? I don't understand why it says 'jpeg' is an unsupported format.
from wand.image import Image
from PIL import Image as PI
import pyocr
import pyocr.builders
import io

# Pick the first available OCR tool and the second language it reports.
tool = pyocr.get_available_tools()[0]
lang = tool.get_available_languages()[1]

req_image = []
final_text = []

# Load the PDF at 300 DPI and convert it to JPEG.
image_pdf = Image(filename="./test.pdf", resolution=300)
image_jpeg = image_pdf.convert('jpeg')  # this line raises the ValueError

# Turn each page into a JPEG blob.
for img in image_jpeg.sequence:
    img_page = Image(image=img)
    req_image.append(img_page.make_blob('jpeg'))

# Run OCR on each page blob and collect the text.
for img in req_image:
    txt = tool.image_to_string(
        PI.open(io.BytesIO(img)),
        lang=lang,
        builder=pyocr.builders.TextBuilder()
    )
    final_text.append(txt)
Answer 0 (score: 0)
Use image_pdf.convert('JPEG').
From ImageMagick Image Formats:
ImageMagick uses an ASCII string known as the magick (for example, GIF) to identify file formats.
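To make the magick-string idea concrete, here is a minimal sketch of matching a requested format name against a list of supported magick strings while ignoring case. The helper pick_magick and the supported list are hypothetical and exist only for illustration. In a real setup the list would come from your ImageMagick build (I believe wand.version.formats() returns it, but treat that as an assumption and check the Wand documentation).

```python
def pick_magick(requested, supported):
    """Return the entry of `supported` that matches `requested`, ignoring case.

    Raises ValueError when no entry matches.
    """
    wanted = requested.strip().upper()
    for name in supported:
        if name.upper() == wanted:
            return name
    raise ValueError(repr(requested) + " is not in the supported format list")


# Hypothetical list for illustration only; the real one depends on how
# ImageMagick was built on your machine.
supported = ["GIF", "JPEG", "JPG", "PDF", "PNG"]

print(pick_magick("jpeg", supported))
```

The string it returns is what you would pass to image_pdf.convert(...). If the name is not in the list at all, changing the letter case alone will not help.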