How to create a high-resolution JPEG with Wand from a binary string

Date: 2017-12-01 22:12:25

Tags: python imagemagick wand


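I am trying to convert some PDFs to high-resolution JPEGs using ImageMagick. I am working on 64-bit Windows 10, with 64-bit Python 3.6.2 and Wand 0.4.4. On the command line I have:

$ /e/ImageMagick-6.9.9-Q16-HDRI/convert.exe -density 400 myfile.pdf -scale 2000x1000 test3.jpg

This works well for me.

In Python: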

import os

from wand.image import Image

file_path = os.path.dirname(os.path.abspath(__file__))+os.sep+"myfile.pdf"

with Image(filename=file_path, resolution=400) as image:
    image.save()
    image_jpeg = image.convert('jpeg')

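This gives me a low-resolution JPEG. How can I translate the command above into Wand code that does the same thing?

Edit:

I realized that the problem is that the input PDF has to be read into the Image object as a binary string, so based on http://docs.wand-py.org/en/0.4.4/guide/read.html#read-blob I tried: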

with open(file_path,'rb') as f:
    image_binary = f.read()

f.close()

with Image(blob=image_binary,resolution=400) as img:
    img.transform('2000x1000', '100%')
    img.make_blob('jpeg')
    img.save(filename='out.jpg')

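This reads the file in OK, but the output is split into 10 files. Why? I need to get this into one high-resolution JPEG.

Edit:

I need to send the JPEG to an OCR API, so I was wondering whether I could write the output to a file-like object. Looking at https://www.imagemagick.org/api/magick-image.php#MagickWriteImageFile, I tried: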

emptyFile =  Image(width=1500, height=2000)

with Image(filename=file_path, resolution=400) as image:

    library.MagickResetIterator(image.wand)
    # Call C-API Append method.
    resource_pointer = library.MagickAppendImages(image.wand,
                                                  True)

    library.MagickWriteImagesFile(resource_pointer,emptyFile)

This gives:

 File "E:/ENVS/r3/pdfminer.six/ocr_space.py", line 113, in <module>
test_file = ocr_stream(filename='test4.jpg')
 File "E:/ENVS/r3/pdfminer.six/ocr_space.py", line 96, in ocr_stream
library.MagickWriteImagesFile(resource_pointer,emptyFile)
ctypes.ArgumentError: argument 2: <class 'TypeError'>: wrong type

How can I get this to work?

2 answers:

Answer 0: (score: 2)

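As follows:

with Image(filename=file_path, resolution=400) as image:
    # transform(crop, resize) works in place: crop to 2000x1000, resize to 100%.
    image.transform('2000x1000', '100%')
    image.compression_quality = 100
    image.save(filename='out.jpg')

Or, in place of the transform() call:

    image.resize(2000, 1000)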
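
Note that resize(2000, 1000) sets exactly those pixel dimensions, whereas the 2000x1000 geometry given to -scale on the command line fits the image inside that box and keeps its aspect ratio. A minimal sketch of computing the fitted size first is shown here. The helper names fit_within and scale_to_fit are hypothetical and not part of Wand, and the sketch only assumes an image object with width, height and resize(), as wand.image.Image provides.

```python
def fit_within(width, height, max_width, max_height):
    """Return the largest (w, h) with the same aspect ratio that fits the box."""
    scale = min(max_width / width, max_height / height)
    # Never return a zero-sized dimension for extreme aspect ratios.
    return max(1, round(width * scale)), max(1, round(height * scale))


def scale_to_fit(image, max_width=2000, max_height=1000):
    """Resize an image object in place so that it fits inside the box.

    `image` is assumed to behave like wand.image.Image
    (width, height and resize(width, height)).
    """
    new_width, new_height = fit_within(image.width, image.height,
                                       max_width, max_height)
    image.resize(new_width, new_height)
```

With this, calling scale_to_fit(image) before image.save(...) should give dimensions closer to what the command line produces, although the resampling filter used by resize() may differ from the one used by -scale.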


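Answer 1: (score: 2)

  

Why? I need to get this into one high-resolution JPEG.

A PDF contains pages, which ImageMagick treats as individual images in a "stack". The Wand library provides wand.image.Image.sequence to work with each page.

However, to append all the images into a single JPEG, you can either iterate over each page and stitch them together, or call the C-API method MagickAppendImages.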
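
The first option, iterating over the pages and stitching them together, could be sketched roughly as follows using only Wand's Python API. The function names stack_layout and stitch_pdf, the white background and the top-to-bottom layout are assumptions made for illustration, not part of the original answer.

```python
def stack_layout(sizes):
    """Given [(width, height), ...], return (canvas_width, canvas_height, tops).

    `tops` holds the vertical offset at which each page is pasted.
    """
    canvas_width = max(width for width, _ in sizes)
    tops, y = [], 0
    for _, height in sizes:
        tops.append(y)
        y += height
    return canvas_width, y, tops


def stitch_pdf(pdf_path, out_path, resolution=400):
    """Render every PDF page and paste the pages top-to-bottom into one JPEG."""
    # Imported here so that stack_layout() can be used without Wand installed.
    from wand.color import Color
    from wand.image import Image

    with Image(filename=pdf_path, resolution=resolution) as pdf:
        sizes = [(page.width, page.height) for page in pdf.sequence]
        width, height, tops = stack_layout(sizes)
        with Image(width=width, height=height,
                   background=Color("white")) as canvas:
            for page, top in zip(pdf.sequence, tops):
                # Copy each page into a full Image before compositing it.
                with Image(image=page) as page_image:
                    canvas.composite(page_image, left=0, top=top)
            canvas.format = "jpeg"
            canvas.save(filename=out_path)
```

For example, stitch_pdf("path_to_document.pdf", "output.jpeg") would write a single tall JPEG. Keep in mind that ten pages at 400 DPI make a very large image, so the C-API route below may be preferable if memory is tight.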

from wand.image import Image
from wand.api import library
import ctypes

# Map C-API not provided by wand library.
library.MagickAppendImages.argtypes = [ctypes.c_void_p, ctypes.c_int]
library.MagickAppendImages.restype = ctypes.c_void_p

with Image(filename="path_to_document.pdf", resolution=400) as image:
    # Do all your preprocessing first
    # Either work directly on the wand instance, or iterate over each page.
    # ...
    # To write all "pages" into a single image.
    # Reset the stack iterator.
    library.MagickResetIterator(image.wand)                    
    # Call C-API Append method.
    resource_pointer = library.MagickAppendImages(image.wand,
                                                  True)        
    # Write C resource directly to disk.
    library.MagickWriteImages(resource_pointer,                
                              "output.jpeg".encode("ASCII"),
                              False)

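Update

  

I need to send the JPEG to an OCR API ...

Assuming you are using OpenCV's Python API, you only need to iterate over each page and pass the image file data to the OCR through a numpy buffer.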

from wand.image import Image
import numpy
import cv2

def ocr_process(file_data_buffer):
    """ Replace with whatever your OCR-API calls for """
    # imdecode needs a read flag as its second argument.
    mat_instance = cv2.imdecode(file_data_buffer, cv2.IMREAD_COLOR)
    # ... work ...

source_image="path_to_document.pdf"
with Image(filename=source_image, resolution=400) as img:
    for page in img.sequence:
        # Copy the page into a full Image, which provides make_blob().
        with Image(image=page) as page_image:
            file_buffer = numpy.asarray(bytearray(page_image.make_blob("JPEG")),
                                        dtype=numpy.uint8)
            ocr_process(file_buffer)
  

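so I was wondering whether I could write the output to a file-like object

Don't assume that Python "image" objects (or the underlying C structures) from different libraries are interchangeable with each other.

Without knowing the OCR API, I can't help you much beyond the Wand part, but I can suggest one of the following...

  • Use a temporary intermediate file. (Slower I/O, but easier to learn/develop/debug)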

    with Image(filename=INPUT_PATH) as img:
        # work
        img.save(filename=OUTPUT_PATH)
    # OCR work on OUTPUT_PATH
    
  • Use a file descriptor, if the OCR API supports it. (Same as above)

    with open(INPUT_PATH, 'rb') as fd:
        with Image(file=fd) as img:
            # work
            # OCR work ???
    
  • Use a blob. (Faster I/O, but needs a lot more memory)

    buffer = None
    with Image(filename=INPUT_PATH) as img:
        # work
        buffer = img.make_blob(FORMAT)
    if buffer:
        # OCR work ???
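
If the OCR API wants a file-like object rather than raw bytes, the blob from the last option can be wrapped in an in-memory file. A minimal sketch follows. The helper names blob_to_file and pdf_pages_as_files and the page-N.jpg naming are hypothetical, and whether your OCR client accepts such an object is an assumption you would need to check.

```python
import io


def blob_to_file(blob, name="page.jpg"):
    """Wrap image bytes in a seekable, named in-memory file object."""
    file_obj = io.BytesIO(blob)
    # Some HTTP clients look at .name to fill in the upload filename.
    file_obj.name = name
    return file_obj


def pdf_pages_as_files(pdf_path, resolution=400):
    """Yield one in-memory JPEG file object per PDF page (requires Wand)."""
    # Imported here so that blob_to_file() can be used without Wand installed.
    from wand.image import Image

    with Image(filename=pdf_path, resolution=resolution) as pdf:
        for number, page in enumerate(pdf.sequence):
            # Copy the page into a full Image, which provides make_blob().
            with Image(image=page) as page_image:
                yield blob_to_file(page_image.make_blob("jpeg"),
                                   name="page-%d.jpg" % number)
```

Each yielded object can be read, seeked and passed around like an open binary file, without anything being written to disk.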
    

More updates

Putting all the comments together, a solution might be...

from wand.image import Image
from wand.api import library
import ctypes
import requests

# Map C-API not provided by wand library.
library.MagickAppendImages.argtypes = [ctypes.c_void_p, ctypes.c_int]
library.MagickAppendImages.restype = ctypes.c_void_p

with Image(filename='path_to_document.pdf', resolution=400) as image:
    # ... Do pre-processing ...
    # Reset the stack iterator.
    library.MagickResetIterator(image.wand)
    # Call C-API Append method.
    resource_pointer = library.MagickAppendImages(image.wand, True)
    # Convert to JPEG.
    library.MagickSetImageFormat(resource_pointer, b'JPEG')
    # Create size sentinel.
    length = ctypes.c_size_t()
    # Write image blob to memory.
    image_data_pointer = library.MagickGetImagesBlob(resource_pointer,
                                                     ctypes.byref(length))
    # Ensure success
    if image_data_pointer and length.value:
        # Create buffer from memory address
        payload = ctypes.string_at(image_data_pointer, length.value)
        # Define local filename.
        payload_filename = 'my_hires_image.jpg'
        # Post payload as multipart encoded image file with filename.
        requests.post(THE_URL, files={'file': (payload_filename, payload)})