Python PyPDF2: combining documents

Time: 2014-07-03 18:04:00

Tags: python

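I'm trying to take two PDF documents, remove the first page from doc1, and add page 1 from doc2 in its place. I think my code is mostly where it needs to be, but I'm getting "ValueError: incomplete format". Here is my code; the error is raised on the third line from the end.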

import os, shutil, sys, PyPDF2
from PyPDF2 import PdfFileWriter, PdfFileReader

origList = list()
for root, dirs, files in os.walk(r'C:\Users\User1\Desktop\pytest\orig'):
    for file in files:
        origList.append(file)
print "Original PDF list:"
print origList

stampList = list()
for root, dirs, files in os.walk(r'C:\Users\User1\Desktop\pytest\stamp'):
    for file in files:
        stampList.append(file)
print "Stamped PDF List:"
print stampList

oX = 0
output = PdfFileWriter()
for document in origList:
    file1 = PdfFileReader(open(r'C:\Users\User1\Desktop\pytest\orig\%s' %origList[oX], "rb"))
    file2 = PdfFileReader(open(r'C:\Users\User1\Desktop\pytest\stamp\%s' %stampList[oX], "rb"))
    file1.decrypt("")
    curFile = origList[oX]
    output.addPage(file2.getPage(0))
    file2Pages = file2.getNumPages()
    file2Counter = 1
    while file2Counter <= file2Pages:
        output.addPage(file1.getPage(file2Counter))
        file2Counter = file2Counter + 1
    outputStream = file(r'C:\Users\User1\Desktop\pytest\output\%' %curFile, "wb")
    output.write(outputStream)
    oX = oX + 1

The line that raises the error is:

outputStream = file(r'C:\Users\User1\Desktop\pytest\output\%' %curFile, "wb")

I followed the example that ships with the module, and this is how I thought it should be written.

1 answer:

Answer 0 (score: 0)

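On the line that raises the error, the format specifier is incomplete: the '%' is not followed by a conversion type, i.e.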

outputStream = file(r'C:\Users\User1\Desktop\pytest\output\%' %curFile, "wb")
                                                            ^
         # what type are we formatting? Need to add an 's' here for 'string'

Compare that with the earlier line...

file1 = PdfFileReader(open(r'C:\Users\User1\Desktop\pytest\orig\%s' %origList[oX], "rb"))
                                                                 ^
                              # the output line is missing this 's'
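
To see the difference in isolation, here is a minimal sketch that does not need PyPDF2 or any files on disk. The shortened path and the helper names broken_path and fixed_path are made up for illustration and are not from the question. It only shows that a lone '%' raises the error while '%s' formats the filename into the path.

```python
def broken_path(filename):
    # A lone '%' starts a format specifier but never names a conversion type,
    # so the % operator raises ValueError when it reaches the end of the string.
    return r'C:\pytest\output\%' % filename


def fixed_path(filename):
    # '%s' is a complete specifier: insert the argument as a string.
    return r'C:\pytest\output\%s' % filename


error_message = None
try:
    broken_path("report.pdf")
except ValueError as exc:
    error_message = str(exc)

fixed = fixed_path("report.pdf")
print(error_message)
print(fixed)
```

Applied to the question's code, the same one-character change (adding the 's' after the final '%') should be enough to get past this particular error. Building the path with os.path.join(output_dir, curFile) would be another way to avoid the format string entirely.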