I am trying to concatenate all my PDFs into a single PDF using the PyPDF2 library. I am using Python 2.7.
My error is:
>>>
RESTART: C:\Users\Yash gupta\Desktop\first projectt\concatenate\test\New folder\test.py
['Invoice.pdf', 'Invoice_2.pdf', 'invoice_3.pdf', 'last.pdf']
Traceback (most recent call last):
  File "C:\Users\Yash gupta\Desktop\first projectt\concatenate\test\New folder\test.py", line 17, in <module>
    pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
  File "C:\Python27\lib\site-packages\PyPDF2\pdf.py", line 1084, in __init__
    self.read(stream)
  File "C:\Python27\lib\site-packages\PyPDF2\pdf.py", line 1689, in read
    stream.seek(-1, 2)
IOError: [Errno 22] Invalid argument
My code is:
import PyPDF2, os
# Get all the PDF filenames.
pdfFiles = []
for filename in os.listdir('.'):
    if filename.endswith('.pdf'):
        pdfFiles.append(filename)
pdfFiles.sort(key=str.lower)
pdfWriter = PyPDF2.PdfFileWriter()
print ( pdfFiles)
# Loop through all the PDF files.
for filename in pdfFiles:
    pdfFileObj = open(filename, 'rb')
    pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
    print ( pdfFileObj )
# Loop through all the pages
for pageNum in range(0, pdfReader.numPages):
    pageObj = pdfReader.getPage(pageNum)
    pdfWriter.addPage(pageObj)
# Save the resulting PDF to a file.
pdfOutput = open('last.pdf', 'wb')
pdfWriter.write(pdfOutput)
pdfOutput.close()
My PDFs contain some non-ASCII characters, so I tried opening them with 'r' first and then with 'rb'.
PS: I am new to Python and to all of these libraries.
Answer 0 (score: 1)
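I believe you are looping over the collected files incorrectly (Python is indentation-sensitive). The page loop has to sit inside the file loop, and the output should be written once, after both loops have finished: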
# Loop through all the PDF files.
for filename in pdfFiles:
    pdfFileObj = open(filename, 'rb')
    pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
    # Loop through all the pages
    for pageNum in range(0, pdfReader.numPages):
        pageObj = pdfReader.getPage(pageNum)
        pdfWriter.addPage(pageObj)
# Save the resulting PDF to a file.
pdfOutput = open('last.pdf', 'wb')
pdfWriter.write(pdfOutput)
pdfOutput.close()
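One more thing that may be worth checking, since the traceback ends in stream.seek(-1, 2): that call fails with "[Errno 22] Invalid argument" when the file is zero bytes long, and the printed file list also contains last.pdf, which is the output name. Whether either of these is what actually happened here is an assumption; the question does not confirm it. A minimal sketch of a filter that skips empty files and the output file before the loop runs (usable_pdfs and its parameters are hypothetical names, not part of PyPDF2):

```python
import os


def usable_pdfs(folder, output_name):
    """Return the PDF names in `folder`, sorted case-insensitively.

    Hypothetical helper: it leaves out the output file (so an earlier
    result is not fed back in) and any zero-byte file (on which
    seek(-1, 2) cannot succeed).
    """
    names = []
    for name in os.listdir(folder):
        if not name.lower().endswith('.pdf'):
            continue  # not a PDF by extension
        if name.lower() == output_name.lower():
            continue  # this is the file we are about to write
        if os.path.getsize(os.path.join(folder, name)) == 0:
            continue  # empty file, nothing to read
        names.append(name)
    names.sort(key=lambda n: n.lower())
    return names
```

With this in place, pdfFiles = usable_pdfs('.', 'last.pdf') could replace the listing code at the top of the script.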
Also, if you just want to merge PDF files, try using PdfFileMerger:
from PyPDF2 import PdfFileMerger
merger = PdfFileMerger(strict=False)
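A rough sketch of how the merger could then be used, assuming the PyPDF2 1.x API that the question's traceback points to (PdfFileMerger with append, write and close). The function name merge_pdfs and its merger parameter are made up for illustration; the parameter only exists so a pre-built merger can be passed in.

```python
def merge_pdfs(paths, output_path, merger=None):
    """Append every PDF in `paths`, in order, and write one combined file.

    Hypothetical helper, not part of PyPDF2. If no merger is supplied,
    a PdfFileMerger(strict=False) is created (PyPDF2 1.x name).
    """
    if merger is None:
        # Imported here so the function can be defined without PyPDF2.
        from PyPDF2 import PdfFileMerger
        merger = PdfFileMerger(strict=False)
    for path in paths:
        merger.append(path)  # adds all pages of this file
    # Write the result once, after every input has been appended.
    with open(output_path, 'wb') as out:
        merger.write(out)
    merger.close()
    return output_path
```

For the files in the question this would be called as merge_pdfs(['Invoice.pdf', 'Invoice_2.pdf', 'invoice_3.pdf'], 'last.pdf'), keeping last.pdf itself out of the input list.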