PyPDF2: setting the PDF version

Time: 2015-03-16 20:45:38

Tags: python pypdf

I am using PyPDF2 1.4 and Python 2.7:

How can I change the PDF version between the input file and the output file?

from PyPDF2 import PdfFileWriter, PdfFileReader
from PyPDF2.generic import NameObject, createStringObject

input_filename = 'my_input_filename.pdf'
# Placeholder output path; adjust as needed
output_filename = 'my_output_filename.pdf'

# Read input PDF file
inputPDF = PdfFileReader(open(input_filename, 'rb'))
info = inputPDF.documentInfo

# Create output PDF
outputPDF = PdfFileWriter()
# Get the info dictionary of the output PDF
infoDict = outputPDF._info.getObject()
# Update output PDF metadata with input PDF metadata
for key in info:
    infoDict.update({NameObject(key): createStringObject(info[key])})

# Copy every page from the input PDF to the output PDF
for i in xrange(inputPDF.numPages):
    outputPDF.addPage(inputPDF.getPage(i))

with open(output_filename, 'wb') as outputStream:
    outputPDF.write(outputStream)

2 answers:

Answer 0 (score: 1)

PyPDF2 in its current version does not generate anything other than files with a PDF 1.3 header. From the official source code:

class PdfFileWriter(object):
    """
    This class supports writing PDF files out, given pages produced by another
    class (typically :class:`PdfFileReader<PdfFileReader>`).
    """
    def __init__(self):
        self._header = b_("%PDF-1.3")
        ...

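Whether that is legal is questionable, considering that it lets you feed in files with a version > 1.3.

If you just want to change the version string in the header (I don't know what consequences that would have, so I'm assuming you know more about the PDF standard than I do!):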

from PyPDF2.utils import b_
...
# replace() returns a new byte string, so assign the result back to _header
outputPDF._header = outputPDF._header.replace(b_("PDF-1.3"), b_("PDF-1.5"))

or something along those lines.
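
If you would rather not touch the private _header attribute, a rough alternative is to patch the header of the file after it has been written. This is only a sketch using the standard library: patch_pdf_header is a made-up helper name, it assumes a single-digit "x.y" version at the very start of the file, and it does not check that the file's content actually conforms to the version you declare. Because the old and new version strings have the same length, the byte offsets recorded in the file's cross-reference table stay the same.

```python
import re

_VERSION_RE = re.compile(r"^\d\.\d$")
_HEADER_RE = re.compile(br"^%PDF-(\d\.\d)")


def patch_pdf_header(path, new_version):
    """Overwrite the version in the %PDF-x.y header; return the old version.

    Hypothetical helper, not part of PyPDF2.
    """
    if not _VERSION_RE.match(new_version):
        raise ValueError("expected a version such as '1.5'")
    with open(path, "r+b") as fh:
        match = _HEADER_RE.match(fh.read(8))
        if match is None:
            raise ValueError("file does not start with a %PDF-x.y header")
        fh.seek(5)  # skip past b"%PDF-"
        fh.write(new_version.encode("ascii"))  # same length, so nothing shifts
    return match.group(1).decode("ascii")
```

Usage would look like patch_pdf_header('output.pdf', '1.5'), where 'output.pdf' is a placeholder for a file you have already written.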

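Answer 1 (score: 1)

To add to Marcus' answer above:

By now (I can't speak for the time when Marcus wrote his post), nothing stops you from specifying the version in the metadata using the standard PyPDF2 addMetadata function. The example below uses PdfFileMerger (since I was recently cleaning up the PDF metadata of some existing files), but PdfFileWriter has the same function: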

from PyPDF2 import PdfFileMerger

# Define file input/output, and metadata containing version string.
# Using separate input/output files, since it's worth keeping a copy of the originals!
fileIn = 'foo.pdf'
fileOut = 'bar.pdf'
metadata = {
    u'/Version': 'PDF-1.5'
}

# Set up PDF file merger, copy existing file contents into merger object.
merger = PdfFileMerger()

with open( fileIn, 'rb') as fh_in:
    merger.append(fh_in)

# Append metadata to PDF content in merger.
merger.addMetadata(metadata)

# Write new PDF file with appended metadata to output
# CAUTION: This will overwrite any existing files without prompt!
with open( fileOut, 'wb' ) as fh_out:
    merger.write(fh_out)
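
To see which version a written file actually declares, you can read its header line. As far as I can tell, addMetadata stores its entries in the document information dictionary, so it is worth checking whether the %PDF-x.y header changed as well. The sketch below uses only the standard library; pdf_header_version is a made-up helper name, and it looks at the header line only.

```python
import re


def pdf_header_version(path):
    """Return the 'x.y' version from the %PDF-x.y header, or None if absent.

    Hypothetical helper, not part of PyPDF2.
    """
    with open(path, "rb") as fh:
        start = fh.read(8)  # b"%PDF-" plus a three-character version
    match = re.match(br"%PDF-(\d\.\d)", start)
    return match.group(1).decode("ascii") if match else None
```

Calling pdf_header_version(fileOut) after the write shows the version stated in the header of the new file.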