Updating a fillable PDF using PyPDF2

Time: 2019-11-17 07:51:23

Tags: python pdf pypdf2

I can't update the named fields in a fillable PDF. My code looks like this:

from PyPDF2 import PdfFileWriter, PdfFileReader

myfile = PdfFileReader("invoice_template.pdf")
first_page = myfile.getPage(0)

writer = PdfFileWriter()

data_dict = {
    'business_name_1': 'Consulting',
    'customer_name': 'company.io',
    'customer_email': 'example@icloud.com',
}

writer.updatePageFormFieldValues(first_page, fields=data_dict)
writer.addPage(first_page)

with open("newfile.pdf","wb") as new:
    writer.write(new)

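I have checked the field dictionary with myfile.getFormTextFields() before and after calling updatePageFormFieldValues(), and the values do get updated. However, none of the field values appear in the generated PDF. I'm not sure what I'm doing wrong. The PDF file I'm using can be found here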
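One way to rule out a simple name mismatch is to compare the keys of data_dict with the field names the reader reports. The sketch below is only an illustration: check_field_names is a hypothetical helper, not part of PyPDF2, and the hard-coded form_fields dictionary is an assumed stand-in for the result of myfile.getFormTextFields().

```python
def check_field_names(form_fields, data):
    """Split the keys of `data` into those found in the form and those missing.

    `form_fields` is assumed to be a mapping of field name -> current value,
    for example the result of myfile.getFormTextFields().
    """
    present = sorted(name for name in data if name in form_fields)
    missing = sorted(name for name in data if name not in form_fields)
    return present, missing


# Hypothetical stand-in for myfile.getFormTextFields(); a real template
# would report its own field names here.
form_fields = {
    "business_name_1": None,
    "customer_name": None,
    "customer_email": None,
}
data_dict = {
    "business_name_1": "Consulting",
    "customer_name": "company.io",
    "customer_email": "example@icloud.com",
}

present, missing = check_field_names(form_fields, data_dict)
print("present:", present)
print("missing:", missing)
```

If nothing is reported as missing, a misspelled field name is probably not the cause, and the problem lies in how the values are stored or displayed.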

1 Answer:

Answer 0: (score: 1)

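This can be solved by setting the PDF's NeedAppearances value to True, which tells the viewer to build the visible appearance of each form field from its stored value. This can be done with a function: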

from PyPDF2 import PdfFileWriter
from PyPDF2.generic import BooleanObject, DictionaryObject, NameObject

def set_need_appearances_writer(writer: PdfFileWriter):
    # See 12.7.2 and 7.7.2 for more information: http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
    try:
        catalog = writer._root_object
        # Create an empty AcroForm dictionary if the catalog has none yet,
        # rather than pointing /AcroForm at an object that already exists
        if "/AcroForm" not in catalog:
            catalog[NameObject("/AcroForm")] = writer._addObject(DictionaryObject())

        # Ask the viewer to regenerate the field appearances on open
        catalog["/AcroForm"][NameObject("/NeedAppearances")] = BooleanObject(True)
        return writer

    except Exception as e:
        print('set_need_appearances_writer() catch : ', repr(e))
        return writer

Then you only need to add the line set_need_appearances_writer(writer) after the line writer = PdfFileWriter(), and the updated form values will show up!

You can find more information here: https://github.com/mstamy2/PyPDF2/issues/355
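To confirm that the flag actually reached the output file, a rough byte-level check can help. This is a heuristic sketch, not a PDF parser: has_need_appearances is a hypothetical helper, the byte strings are made-up fragments, and it assumes the AcroForm dictionary is stored uncompressed in the file. A file that keeps its objects in compressed object streams would not match.

```python
import re


def has_need_appearances(pdf_bytes):
    """Heuristically report whether the raw bytes contain '/NeedAppearances true'."""
    return re.search(rb"/NeedAppearances\s+true\b", pdf_bytes) is not None


# Hypothetical fragments standing in for the contents of a written file.
with_flag = (
    b"<<\n/Type /Catalog\n/AcroForm 4 0 R\n>>\n"
    b"4 0 obj\n<<\n/NeedAppearances true\n>>"
)
without_flag = b"<<\n/Type /Catalog\n/Pages 1 0 R\n>>"

print(has_need_appearances(with_flag))
print(has_need_appearances(without_flag))

# Against a real file (assumed name, as in the question):
#     with open("newfile.pdf", "rb") as f:
#         print(has_need_appearances(f.read()))
```

A negative result on a real file is not conclusive, for the reason given above; a positive result at least shows the flag was written.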

Fixed code

from PyPDF2 import PdfFileWriter, PdfFileReader
from PyPDF2.generic import BooleanObject, DictionaryObject, NameObject

def set_need_appearances_writer(writer: PdfFileWriter):
    # See 12.7.2 and 7.7.2 for more information: http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
    try:
        catalog = writer._root_object
        # Create an empty AcroForm dictionary if the catalog has none yet,
        # rather than pointing /AcroForm at an object that already exists
        if "/AcroForm" not in catalog:
            catalog[NameObject("/AcroForm")] = writer._addObject(DictionaryObject())

        # Ask the viewer to regenerate the field appearances on open
        catalog["/AcroForm"][NameObject("/NeedAppearances")] = BooleanObject(True)
        return writer

    except Exception as e:
        print('set_need_appearances_writer() catch : ', repr(e))
        return writer

myfile = PdfFileReader("invoice_template.pdf")
first_page = myfile.getPage(0)

writer = PdfFileWriter()
set_need_appearances_writer(writer)

data_dict = {
    'business_name_1': 'Consulting',
    'customer_name': 'company.io',
    'customer_email': 'example@icloud.com',
}

writer.updatePageFormFieldValues(first_page, fields=data_dict)
writer.addPage(first_page)

with open("newfile.pdf","wb") as new:
    writer.write(new)