Question

I have a function that takes a PDF file path as input and splits the file into separate pages, as shown below:

import os,time
from pyPdf import PdfFileReader, PdfFileWriter

def split_pages(file_path):
    print("Splitting the PDF")
    temp_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), "temp_"+str(int(time.time())))
    if not os.path.exists(temp_path):
        os.makedirs(temp_path)
    inputpdf = PdfFileReader(open(file_path, "rb"))
    if inputpdf.getIsEncrypted():
        inputpdf.decrypt('')
    for i in xrange(inputpdf.numPages):
        output = PdfFileWriter()
        output.addPage(inputpdf.getPage(i))
        with open(os.path.join(temp_path,'%s.pdf'% i),"wb") as outputStream:
            output.write(outputStream)

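It works fine for small files, but when the PDF has more than 152 pages it only splits pages 0-151 and then stops. It also eats up all of the system's memory until I kill it.

Please let me know what I am doing wrong, where the problem lies, and how to fix it.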

Answer 1

The problem appears to be in pyPdf itself. I switched to PyPDF2 and it worked.
