test.pdf: "Hello"
tomerge1.pdf: " 1"
tomerge2.pdf: " 2"
In output.pdf, I would like:
page 1 of test.pdf merged with page 1 of tomerge1.pdf, i.e. "Hello 1"
page 1 of test.pdf merged with page 1 of tomerge2.pdf, i.e. "Hello 2"
Here is what I am using:
from PyPDF2 import PdfFileWriter, PdfFileReader
outputpdf = PdfFileWriter()
inputpdf = PdfFileReader(open("test.pdf", "rb"))
tomerge1 = PdfFileReader(open("tomerge1.pdf", "rb"))
tomerge2 = PdfFileReader(open("tomerge2.pdf", "rb"))
page = inputpdf.getPage(0)
page.mergePage(tomerge1.getPage(0))
outputpdf.addPage(page)
# exit()
# if we stop here, the output is "Hello 1", which is good
# Why isn't "Hello 1" remembered here?
# del page # doesn't change anything
page = inputpdf.getPage(0)
page.mergePage(tomerge2.getPage(0))
outputpdf.addPage(page)
with open("output.pdf", "wb") as f:
    outputpdf.write(f)
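Sadly, it doesn't work: instead of "Hello 1" / "Hello 2", the output is "Hello 2" / "Hello 2".
Question: how can I get the expected behavior (without the file size growing quickly when there are 10 or 20 pages)?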
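Answer 0 (score: 1)
When I did a similar exercise, I found that you need to read once and merge once. The way to do that is to set up two readers for the input file being merged ("test.pdf"). Sample code follows: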
addressfile = open("Documents/addresses.pdf","rb")
xwfile = "Downloads/input.pdf"
crosswordfile = open(xwfile,"rb")
xword = PdfFileReader(crosswordfile)
xw2 = PdfFileReader(crosswordfile)
addr = PdfFileReader(addressfile)
xwpage = xword.getPage(0)
addpage1 = addr.getPage(1)
addpage2 = addr.getPage(2)
pdfWriter = PdfFileWriter()
xp2 = xw2.getPage(0)
xwpage.mergePage(addpage1)
xp2.mergePage(addpage2)
res = open("/home/paula/xw.pdf",'wb')
pdfWriter.addPage(xwpage)
pdfWriter.addPage(xp2)
pdfWriter.write(res)
res.close()
crosswordfile.close()
addressfile.close()
So in your code, this becomes:
testfile = open("test.pdf", "rb")
outputpdf = PdfFileWriter()
inputpdf1 = PdfFileReader(testfile)
inputpdf2 = PdfFileReader(testfile)
tomerge1 = PdfFileReader(open("tomerge1.pdf", "rb"))
tomerge2 = PdfFileReader(open("tomerge2.pdf", "rb"))
page1 = inputpdf1.getPage(0)
page1.mergePage(tomerge1.getPage(0))
outputpdf.addPage(page1)
# exit()
# No need to stop here; the output will have both "Hello 1" and "Hello 2".
# Using two readers for the same file fools PyPDF2 into thinking they
# are two different files, i.e. that we are merging from two separate sources
page2 = inputpdf2.getPage(0)
page2.mergePage(tomerge2.getPage(0))
outputpdf.addPage(page2)
with open("output.pdf", "wb") as f:
    outputpdf.write(f)
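To see why a second reader helps, here is a toy model in plain Python. It is only a sketch and it does not use PyPDF2: ToyPage, ToyReader and ToyWriter are hypothetical stand-ins. It rests on three assumptions about the PyPDF2 1.x API used above that I have not verified here: the reader hands back the same cached page object on every getPage(0) call, mergePage changes that object in place, and addPage stores a reference rather than a copy.

```python
# Toy model only: ToyPage, ToyReader and ToyWriter are hypothetical
# stand-ins for illustration, not PyPDF2 classes.

class ToyPage:
    def __init__(self, text):
        self.text = text

    def merge(self, other):
        # Mutates this page in place (assumed to mirror mergePage).
        self.text += other.text


class ToyReader:
    def __init__(self, page_texts):
        # Pages are built once and cached (assumed reader behaviour).
        self._pages = [ToyPage(t) for t in page_texts]

    def get_page(self, index):
        return self._pages[index]  # same object on every call


class ToyWriter:
    def __init__(self):
        self.pages = []

    def add_page(self, page):
        self.pages.append(page)  # keeps a reference, not a copy


def merge_with_one_reader():
    base = ToyReader(["Hello"])
    writer = ToyWriter()
    for suffix in (" 1", " 2"):
        page = base.get_page(0)        # the same page both times
        page.merge(ToyPage(suffix))
        writer.add_page(page)
    return writer.pages


def merge_with_reader_per_page():
    writer = ToyWriter()
    for suffix in (" 1", " 2"):
        page = ToyReader(["Hello"]).get_page(0)  # fresh reader, fresh page
        page.merge(ToyPage(suffix))
        writer.add_page(page)
    return writer.pages


shared = merge_with_one_reader()
separate = merge_with_reader_per_page()
print(shared[0] is shared[1])        # True: one object added twice
print([p.text for p in separate])    # ['Hello 1', 'Hello 2']
```

With a single reader, the writer ends up holding the same page object twice, so the two output pages cannot differ. With one reader per merge, each output page is its own object. For 10 or 20 overlays the same idea can be written as a loop that creates a new reader for test.pdf on each iteration. I have not measured how the output file size behaves in that case.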