Question

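This link shows how to convert a PDF to images. Is there a way to zoom a PDF before converting it to images? In my project, I convert PDFs to PNGs and then use the Python-tesseract library to extract text. I noticed that if I zoom in on a PDF and then save parts of it as PNGs, OCR gives better results. So is there a way to zoom a PDF before converting it to PNG?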

Answer 1

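I think increasing the quality (resolution) of the image is a better approach than zooming the PDF.

This is easy to do with pdf2image:

Install pdf2image: pip install pdf2image

Then, in Python, convert your PDF to a high-resolution image: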

from pdf2image import convert_from_path

pages = convert_from_path('sample.pdf', dpi=400)  # 400 is the image resolution in DPI (default is 200)

pages[0].save("sample.png")

By adjusting the DPI value, you should get the result you want.
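To tie this back to the question's OCR step, here is a minimal sketch of the whole flow. It assumes that "zooming" can be treated as a multiplier on a baseline DPI, which is my interpretation rather than anything pdf2image defines. The helper names `dpi_for_zoom`, `pixel_size`, and `ocr_pdf` are hypothetical, and `ocr_pdf` assumes pdf2image (with poppler) and pytesseract (with Tesseract) are installed.

```python
def dpi_for_zoom(zoom, base_dpi=200):
    """Hypothetical helper: treat a zoom factor as a multiplier on a baseline DPI."""
    if zoom <= 0:
        raise ValueError("zoom must be positive")
    return int(round(base_dpi * zoom))


def pixel_size(width_pt, height_pt, dpi):
    """Pixel dimensions of a page given its size in PDF points (1 point = 1/72 inch)."""
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


def ocr_pdf(pdf_path, zoom=2.0, base_dpi=200):
    """Render each page at the scaled DPI and run OCR on it; returns one string per page."""
    # Imported here so the helpers above work even without these packages installed.
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path(pdf_path, dpi=dpi_for_zoom(zoom, base_dpi))
    return [pytesseract.image_to_string(page) for page in pages]


if __name__ == "__main__":
    # A US Letter page is 612 x 792 points.
    print(dpi_for_zoom(2.0))          # 400
    print(pixel_size(612, 792, 400))  # (3400, 4400)
```

Calling `ocr_pdf('sample.pdf', zoom=2.0)` would render at 400 DPI, the same value used above. Higher DPI also means larger images and slower OCR, so it is probably worth trying a few values rather than going straight to the maximum.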
