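I am trying to merge PDF files in Python using PyPDF2.
The problem is the file size.
Is there another way to merge the files that has no file size limit and no memory problems?
Size of file 1 = 900 MB
Size of file 2 = 300 MB
My idea: is there a way to load only the last page of the first PDF and append the second PDF to it?
My code: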
from PyPDF2 import PdfFileMerger, PdfFileReader
merger = PdfFileMerger()
filename1 = 'document-output3.pdf'
filename2 = 'file1.pdf'
# Append both documents in order.
merger.append(PdfFileReader(open(filename1, 'rb')))
merger.append(PdfFileReader(open(filename2, 'rb')))
# Write to a new file (the name shown in the traceback), not over an input that is still open.
merger.write("document-output5.pdf")
- Error message -
Traceback (most recent call last):
  File "C:\Users\USERNAME\eclipse-workspace\PyPDF2\MergePDF\mergepdf.py", line 13, in <module>
    merger.write("document-output5.pdf")
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\merger.py", line 230, in write
    self.output.write(fileobj)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\pdf.py", line 482, in write
    self._sweepIndirectReferences(externalReferenceMap, self._root)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\pdf.py", line 556, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, data[i])
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\pdf.py", line 586, in _sweepIndirectReferences
    newobj = self._sweepIndirectReferences(externMap, newobj)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\pdf.py", line 577, in _sweepIndirectReferences
    newobj = data.pdf.getObject(data)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\pdf.py", line 1611, in getObject
    retval = readObject(self.stream, self)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\generic.py", line 66, in readObject
    return DictionaryObject.readFromStream(stream, pdf)
  File "C:\Users\USERNAME\AppData\Local\Programs\Python\Python36-32\lib\site-packages\PyPDF2\generic.py", line 611, in readFromStream
    data["__streamdata__"] = stream.read(length)
MemoryError
Thanks for your attention,
Fabian