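PyPDF2 append file problem

Date: 2015-12-17 08:35:40

Tags: python

I need to write a script that converts images to PDF and merges them into a single file.

I tried using img2pdf and PyPDF2, but I get an error. Could someone take a look and tell me what I am doing wrong?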

import img2pdf
import os
from PyPDF2 import PdfFileReader, PdfFileMerger, PdfFileWriter

merger = PdfFileMerger()
path = input()

for root,dir,files in os.walk(path):
        for eachfile in files:
            if "pdf" not in eachfile:
                os.chdir(root)
                PDFfile = img2pdf.convert((eachfile,), dpi=None, x=None, y=None)
                merger.append(fileobj=PDFfile)
merger.write(open("out.pdf", "wb"))

ERROR

Traceback (most recent call last):
  File "C:/Users/ms/Desktop/Desktop/test.py", line 13, in <module>
    merger.append(fileobj=PDFfile)
  File "C:\Python34\lib\site-packages\PyPDF2\merger.py", line 203, in append
    self.merge(len(self.pages), fileobj, bookmark, pages, import_bookmarks)
  File "C:\Python34\lib\site-packages\PyPDF2\merger.py", line 133, in merge
    pdfr = PdfFileReader(fileobj, strict=self.strict)
  File "C:\Python34\lib\site-packages\PyPDF2\pdf.py", line 1065, in __init__
    self.read(stream)
  File "C:\Python34\lib\site-packages\PyPDF2\pdf.py", line 1660, in read
    stream.seek(-1, 2)
AttributeError: 'bytes' object has no attribute 'seek'

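1 answer:

Answer 0 (score: 1)

img2pdf.convert returns the contents of the resulting PDF file as a bytes object (which is what the AttributeError in the traceback is complaining about), not a file handle. If you read help(merger.append), you will see that it expects either a file handle or the path to a PDF file. Here is one possible solution, which writes an intermediate PDF for each image. It is probably also possible to avoid generating those intermediate files.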

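import os

import img2pdf
from PyPDF2 import PdfFileMerger

merger = PdfFileMerger()
path = "/tmp/images"

for root, dirs, files in os.walk(path):
    for eachfile in files:
        # Skip files that are already PDFs.
        if not eachfile.lower().endswith(".pdf"):
            # Build the full path instead of calling os.chdir() inside os.walk().
            imgpath = os.path.join(root, eachfile)
            # convert() returns the PDF as bytes, not as a file object.
            pdfbytes = img2pdf.convert((imgpath,), dpi=None, x=None, y=None)
            pdfname = os.path.splitext(imgpath)[0] + ".pdf"
            # Write the bytes to an intermediate PDF, then append it by path.
            with open(pdfname, "wb") as f:
                f.write(pdfbytes)
            merger.append(pdfname)

with open("out.pdf", "wb") as f:
    merger.write(f)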
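
On the point about skipping the intermediate files: one approach that might work is to wrap the bytes returned by img2pdf.convert in io.BytesIO, which provides the seek and read methods that the traceback shows PdfFileReader calling. The following is only a sketch under that assumption. The helper names as_stream and merge_images are made up for illustration, and the converter and merger are passed in as arguments so that the helper itself depends only on the standard library.

```python
import io
import os


def as_stream(pdf_bytes):
    """Wrap raw PDF bytes in a seekable, readable in-memory file object."""
    return io.BytesIO(pdf_bytes)


def merge_images(path, convert, merger):
    """Convert every non-PDF file under path and append it to merger.

    convert: callable that takes a file path and returns PDF bytes
             (assumed here to wrap img2pdf.convert).
    merger:  object with an append(fileobj=...) method
             (assumed here to be a PdfFileMerger).
    """
    for root, dirs, files in os.walk(path):
        # Sorted so the page order within each directory is deterministic.
        for name in sorted(files):
            if name.lower().endswith(".pdf"):
                continue
            pdf_bytes = convert(os.path.join(root, name))
            # No intermediate file: hand the merger an in-memory stream.
            merger.append(fileobj=as_stream(pdf_bytes))
    return merger
```

With the libraries from the question, this would be called roughly as merge_images("/tmp/images", lambda p: img2pdf.convert((p,), dpi=None, x=None, y=None), PdfFileMerger()), followed by merger.write(...) as above.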

By the way, this would also be simpler with standard tools such as convert, pdfjam, or pdftk.
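
For the pdftk route, PDFs are merged with its cat operation (pdftk in1.pdf in2.pdf cat output out.pdf). As a rough sketch, that call could be driven from Python as follows. The helper names are hypothetical, the inputs are assumed to be PDFs already, and the command only runs if pdftk is found on PATH.

```python
import shutil
import subprocess


def pdftk_merge_command(inputs, output):
    """Build the argument list for: pdftk <inputs...> cat output <output>."""
    return ["pdftk", *inputs, "cat", "output", output]


def merge_with_pdftk(inputs, output):
    """Run pdftk if it is installed; return True when the merge was run."""
    if shutil.which("pdftk") is None:
        return False  # pdftk is not available on this machine
    subprocess.run(pdftk_merge_command(inputs, output), check=True)
    return True
```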