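I want to create a script that reads all the PDF files in a directory, copies the second page of each file, and writes those pages to a single output PDF (containing all the second pages).
I have already written some code, but it gives me a PDF with blank pages. This is really strange, because I have another script that takes the second page of each PDF and creates a new PDF for each second page, and that one works. I think my problem may be related to addPage().
I am using the PyPDF2 library to work with PDF files.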
import pathlib
from PyPDF2 import PdfFileReader, PdfFileWriter

files_list = [file for file in pathlib.Path(__file__).parent.iterdir() if (file.is_file() and not str(file).endswith(".py"))]
total = len(files_list)
writer = PdfFileWriter()
for file in files_list:
    with open(file, 'rb') as infile:
        reader = PdfFileReader(infile)
        reader.decrypt("")
        writer.addPage(reader.getPage(1))
with open('Output.pdf', 'wb') as outfile:
    writer.write(outfile)
print('Done.')
Answer 0 (score: 0)
Take a look at PdfFileMerger.append. It lets you merge pages from multiple PDFs into a single result file.
append(fileobj, bookmark=None, pages=None, import_bookmarks=True)
Identical to the merge() method, but assumes you want to concatenate all pages onto the end of the file instead of specifying a position.
Parameters:
    fileobj: A File Object or an object that supports the standard read and seek methods similar to a File Object. Could also be a string representing a path to a PDF file.
    bookmark (str): Optionally, you may specify a bookmark to be applied at the beginning of the included file by supplying the text of the bookmark.
    pages: Can be a Page Range or a (start, stop[, step]) tuple to merge only the specified range of pages from the source document into the output document.
    import_bookmarks (bool): You may prevent the source document's bookmarks from being imported by specifying this as False.
This seems better suited to what you are currently doing with PdfFileWriter.
from PyPDF2 import PdfFileMerger

# ...
merger = PdfFileMerger()
# pages must be a (start, stop[, step]) tuple, not a list; indices are
# zero-based, so (1, 2) selects only the second page of each file
merger.append(filename1, pages=(1, 2))
merger.append(filename2, pages=(1, 2))
merger.write("document-output.pdf")
merger.close()
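To apply this to a whole directory rather than two named files, something along the following lines might work. It is only a sketch: `find_pdfs` and `merge_second_pages` are hypothetical helper names, and it assumes a PyPDF2 release older than 3.0 (where PdfFileMerger is still available) and that the `pages` tuple is zero-based.

```python
from pathlib import Path


def find_pdfs(directory):
    """Return the PDF files directly inside `directory`, sorted by name."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def merge_second_pages(pdf_paths, output_path):
    """Append the second page of every input PDF to one output file."""
    # Imported lazily so find_pdfs() stays usable without PyPDF2 installed.
    # Assumption: PyPDF2 < 3.0, which still ships PdfFileMerger.
    from PyPDF2 import PdfFileMerger

    merger = PdfFileMerger()
    for path in pdf_paths:
        # (1, 2) is a zero-based (start, stop) range: the second page only.
        # Files with fewer than two pages are not handled in this sketch.
        merger.append(str(path), pages=(1, 2))
    merger.write(str(output_path))
    merger.close()
```

Calling `merge_second_pages(find_pdfs("."), "Output.pdf")` would then collect the second pages from the current directory. Note that a previous Output.pdf in the same directory would be picked up as an input on the next run, so writing the result elsewhere may be safer.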
Answer 1 (score: 0)
Have you tried the following code, taken from https://www.randomhacks.co.uk/how-to-split-a-pdf-every-2-pages-using-python/
# PyPDF2 (< 3.0) keeps the class names of the older, Python 2 only pyPdf package
from PyPDF2 import PdfFileWriter, PdfFileReader
import glob

pdfs = glob.glob("*.pdf")
for pdf in pdfs:
    # file() exists only in Python 2; open() works in Python 2 and 3
    inputpdf = PdfFileReader(open(pdf, "rb"))
    for i in range(inputpdf.numPages // 2):
        # Each output file receives one pair of consecutive pages
        output = PdfFileWriter()
        output.addPage(inputpdf.getPage(i * 2))
        if i * 2 + 1 < inputpdf.numPages:
            output.addPage(inputpdf.getPage(i * 2 + 1))
        newname = pdf[:7] + "-" + str(i) + ".pdf"
        with open(newname, "wb") as outputStream:
            output.write(outputStream)
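Note that the snippet above writes each output while the input PDF is still open. The same idea can be carried over to the single-output case in the question. The sketch below is not a confirmed diagnosis: it assumes PyPDF2 older than 3.0, and it assumes the blank pages come from the input files being closed before writer.write() runs, since PyPDF2 reads page content lazily from the source stream. `write_second_pages` is a hypothetical name.

```python
from contextlib import ExitStack


def write_second_pages(pdf_paths, output_path):
    """Copy the second page of each PDF into one output file.

    Returns the number of pages written.
    """
    # Assumption: PyPDF2 < 3.0, where these class names are available.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    writer = PdfFileWriter()
    # ExitStack keeps every input file open until the output is written.
    with ExitStack() as stack:
        for path in pdf_paths:
            infile = stack.enter_context(open(path, "rb"))
            reader = PdfFileReader(infile)
            if reader.isEncrypted:
                # Same empty-password attempt as in the question.
                reader.decrypt("")
            if reader.getNumPages() > 1:
                writer.addPage(reader.getPage(1))  # zero-based index
        with open(output_path, "wb") as outfile:
            writer.write(outfile)
    return writer.getNumPages()
```

The only structural difference from the code in the question is that the output is written before any input file is closed. PDFs with a single page are skipped rather than raising an error.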